万朱浩 / Venue-Ops
Authored by 戒酒的李白, 2024-10-06 11:34:31 +0800
Commit f5e307d3f80999cb047c5d46ed2833dc4e688df7 (1 parent: ee739c3c)
Define the linear transformation layer
Showing 1 changed file with 6 additions and 1 deletion
model_pro/MHA.py
@@ -10,9 +10,14 @@ class MultiHeadAttentionLayer(nn.Module):
        assert (
            self.head_dim * num_heads == embed_size
        ), "Embedding size needs to be divisible by num_heads"
        # Define linear layers for Q, K, V
        self.q_linear = nn.Linear(embed_size, embed_size)
        self.k_linear = nn.Linear(embed_size, embed_size)
        self.v_linear = nn.Linear(embed_size, embed_size)

if __name__ == "__main__":
    embed_size = 512
    num_heads = 8
    mha_layer = MultiHeadAttentionLayer(embed_size, num_heads)
    print("Model initialized successfully.")
    print("Linear layers for Q, K, V initialized.")
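For context, here is a minimal, hypothetical sketch of how these Q, K, V projections would typically be used in a complete forward pass. Only the __init__ additions shown in the diff are from this commit; the head_dim attribute, the fc_out output projection, and the forward logic below are assumptions for illustration. The assert guards the head split: with embed_size = 512 and num_heads = 8, each head receives head_dim = 512 // 8 = 64.

import torch
import torch.nn as nn

class MultiHeadAttentionLayer(nn.Module):
    def __init__(self, embed_size, num_heads):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = embed_size // num_heads
        assert (
            self.head_dim * num_heads == embed_size
        ), "Embedding size needs to be divisible by num_heads"
        # Linear projections for Q, K, V, as added in this commit
        self.q_linear = nn.Linear(embed_size, embed_size)
        self.k_linear = nn.Linear(embed_size, embed_size)
        self.v_linear = nn.Linear(embed_size, embed_size)
        # Output projection (assumption: not part of this commit)
        self.fc_out = nn.Linear(embed_size, embed_size)

    def forward(self, query, key, value):
        # query/key/value: (batch, seq_len, embed_size)
        N, q_len, _ = query.shape
        k_len = key.shape[1]
        # Project, then split embed_size into (num_heads, head_dim)
        q = self.q_linear(query).view(N, q_len, self.num_heads, self.head_dim).transpose(1, 2)
        k = self.k_linear(key).view(N, k_len, self.num_heads, self.head_dim).transpose(1, 2)
        v = self.v_linear(value).view(N, k_len, self.num_heads, self.head_dim).transpose(1, 2)
        # Scaled dot-product attention, computed per head
        scores = q @ k.transpose(-2, -1) / (self.head_dim ** 0.5)
        attn = torch.softmax(scores, dim=-1)
        out = attn @ v                                  # (N, num_heads, q_len, head_dim)
        out = out.transpose(1, 2).reshape(N, q_len, -1) # merge heads back to embed_size
        return self.fc_out(out)

if __name__ == "__main__":
    embed_size, num_heads = 512, 8
    mha = MultiHeadAttentionLayer(embed_size, num_heads)
    x = torch.randn(2, 10, embed_size)
    print(mha(x, x, x).shape)  # torch.Size([2, 10, 512])

Projecting to the full embed_size and then viewing the result as (num_heads, head_dim) is the usual way to give every head its own slice of the same three nn.Linear layers, rather than allocating a separate projection per head.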