Let’s start with a regular BM with a known drift *rate* denoted “m”, and known variance parameter value, denoted “s”:
dX = m dt + s dBt
In other words,
Xt – X0 = m*t + s*Bt
Here, “… + Bt” has a non-trivial meaning. It is not same as adding two numbers or adding two variables, but rather a signal-noise formula… It describes a Process, with a non-random, deterministic part, and a random part whose variance at time t is equal to (s2 t)
Next, we construct or encounter a random process G(t) related but not derived from this BM:
dG/G = m dt + s dBt …….. 
It turns out this process can be exactly described as
G = G0exp[ (m- ½ s2)t + s Bt ] ………. 
Again, the simple-looking “… + s Bt” expression has a non-trivial meaning. It describes a Process, whose log value has a deterministic component, and a random component whose variance is (s2 t).
Note in the formula above (m- ½ s2) isn’t the drift of GBM process G(t), because left hand side is “dG / G” rather than dG itself.
In contrast, (m- ½ s2) is a drift rate in the “log” process L(t) := log G(t). This log process is a BM.
dL = (m – ½ s2) dt + s dBt …… 
If we compare  vs , we see the drift rate eroded by (½ s2).
(You may feel dL =?= dG/G but that’s “before” Ito. Since G(t) is an Ito process, to get dL we must apply Ito’s and we end up with .)
I wish there’s only one form to remember, but unfortunately,  and  are both used extensively.
* Starting from a BM with drift = (u) dt
** the exponential process Y(t) derived from the BM has drift (not drift rate)
= [u + ½ s2 ] Y(t) dt
* Starting from a GBM (Not something derived from BM) process with drift (not drift rate) = m* G(t) dt
** the log process L(t), derived from the GBM process, is a BM with drift
= (m – ½ s2) dt, not “…L(t) dt”