
Not allowing + in an email field is one of my pet peeves. Congrats on finding an amazing-looking regex for email validation instead of thinking about it.


Yup. The most email validation I ever implement is "there must be an @ sign with stuff before and after the @ sign". Maybe require a dot in the latter space.
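
For what it's worth, that check fits in a few lines. This is only a rough sketch of the idea, not a real validator; the function name `looks_like_email` and the `require_dot` flag are made up for illustration.

```python
def looks_like_email(address: str, require_dot: bool = False) -> bool:
    """Loose check: non-empty text before the last '@' and non-empty text after it."""
    # rpartition splits on the LAST '@'; sep is '' when there is no '@' at all.
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return False
    # Optional extra rule: insist on a dot somewhere in the domain part.
    if require_dot and "." not in domain:
        return False
    return True
```

Note that `looks_like_email("foo+bar@example.com")` passes, and that turning on `require_dot` would reject a bare `name@tld`.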


I know lots of developers (myself included) who thought they could write a functional email regex. Eventually you learn that in the end, the way to validate an email address is to send an email.


> Eventually you learn that in the end, the way to validate an email address is to send an email.

In my experience as a user, most people in the end also validate the email address by sending an email, but do no learning.


Strictly speaking, the dot in the latter space isn't actually necessary. It needs to be a resolvable domain, but if you bought a TLD you could be "name@tld"


I don't think an @domain is required if it's an email address on the same mail server.


That certainly used to work on Gmail; I haven't tried recently.

(Obvious caveat being that any provider can trivially append @provid.er if you don't specify, so this isn't really proof that email is specified that way.)


Wow, that is literally the only reason I would want a TLD


ICANN forbids application records at the apex of a TLD. The only TLDs that do funky things like this are ccTLDs.

And if you try to actually use a bare TLD you will crash hard into collisions with unqualified domains and search paths. So, fun as a hack to amuse fellow techies but not useful for anything real.


...is there someone out there who did this?


Not that I'm aware of. Although I wouldn't be surprised if Google did it for their employees. They already own the google TLD, so they could very easily make it point to the google.com emails.


How many HN-reading googlers have you just made to send 'Test' emails to themselves, I wonder!


Can you actually attach an MX record to a TLD?



Huh, cool. Though I don't see a lot of those records myself. For example, `dig MX ai.` gives me the MX record for ai., but I can't see any MX records for as., bj., or dj.


.+@.+\..+


That regex is defective. See benchaney's reply above.


.+@.+(\..+)?

should satisfy you both, right? I usually use the original, or some form of it, when I have to check emails.


That's identical to just doing .+@.+
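
A quick way to convince yourself, sketched with Python's `re` module. The optional group can match nothing, and anything it does match is already covered by the trailing `.+`. The sample strings and the helper name `same_verdict` are made up for illustration.

```python
import re

WITH_OPTIONAL_DOT = re.compile(r".+@.+(\..+)?")
PLAIN = re.compile(r".+@.+")

def same_verdict(candidate: str) -> bool:
    """True when both patterns agree on whether the whole string matches."""
    a = WITH_OPTIONAL_DOT.fullmatch(candidate) is not None
    b = PLAIN.fullmatch(candidate) is not None
    return a == b

# Arbitrary samples: some match both patterns, some match neither.
samples = ["sally@google", "sally@google.com", "sally@", "@google",
           "a@b.", "no-at-sign", "a@@b"]
disagreements = [s for s in samples if not same_verdict(s)]
```

`disagreements` comes out empty for these samples, which is what you'd expect if the two patterns accept the same strings.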


Fully qualified domain names actually end in a dot. If you're allowing sally@google, then to be valid you must also allow sally@google. (with the trailing dot).


No you don't. Email addresses explicitly require any periods in the domain to have at least one (non-period) character after the period. From RFC 5322, the relevant grammar production for the domain looks like

  dot-atom-text   =   1*atext *("." 1*atext)
(where atext is letters, digits, or a set of specific punctuation characters that doesn't include periods).
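
If it helps, here is that production transcribed into a Python regex. It's a sketch of `dot-atom-text` only, not the full `addr-spec` grammar (no quoted strings, domain literals, or obsolete forms), and the helper name `is_dot_atom_text` is made up.

```python
import re

# atext per RFC 5322: letters, digits, and a fixed set of punctuation (no '.').
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"

# dot-atom-text = 1*atext *("." 1*atext)
DOT_ATOM_TEXT = re.compile(rf"{ATEXT}+(?:\.{ATEXT}+)*")

def is_dot_atom_text(text: str) -> bool:
    """True when the whole string matches the dot-atom-text production."""
    return DOT_ATOM_TEXT.fullmatch(text) is not None
```

Under this grammar `google` and `google.com` pass, while `google.` fails because nothing follows the final period.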


Ah, I wasn't aware that email addresses defined that differently. It looks like it's also the same in obs-domain, as used by the addr-spec part of that. RFC 822 also seems to say the same thing. It's been way too long since I've tried to read those RFCs.


One (probably forgotten) reason people validated email addresses with convoluted regexps is that there are multiple email address formats (actually valid per the RFCs) that can do nasty things, like explicitly specifying a series of mail servers to go through. They're (hopefully) deprecated and rejected by most servers nowadays, though.


> hopefully

Why?


> By default, this feature is turned off. This closes a nasty open relay loophole where a backup MX host can be tricked into forwarding junk mail to a primary MX host which then spams it out to the world.

http://www.postfix.org/postconf.5.html#allow_untrusted_routi...


Thankfully my email is @@@@@@@.......


Allowing + in an email/password field, but failing to properly encode/decode it, is even more infuriating.
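
The usual way this goes wrong, sketched with Python's `urllib.parse`: in a form-encoded query string a raw `+` means a space, so an address pasted in unencoded comes back mangled. The `email` parameter name is just an example.

```python
from urllib.parse import parse_qs, urlencode

address = "foo+bar@example.com"

# Wrong: the address is dropped into the query string without encoding,
# so the decoder treats '+' as an encoded space.
broken = parse_qs("email=" + address)["email"][0]

# Right: urlencode percent-encodes '+' as %2B (and '@' as %40),
# so decoding recovers the original address.
encoded = urlencode({"email": address})
round_tripped = parse_qs(encoded)["email"][0]
```

Here `broken` ends up as `foo bar@example.com`, while `round_tripped` is the original address.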


Well, mail to string+{whatever1}@gmail.com is delivered to string@gmail.com. So a user can just open an account for string@gmail.com and then use that to open thousands of user accounts on any site that allows "+" in the email field. (I used to do this on sites that allowed a limited number of free downloads, after which you had to pay.)


I recently started adding random emojis and lyrics to gmail addresses, it isn't usually shown so it's a nearly-steganographic rick-roll.

example+Youre+the+best!+Around!+Nothings+gonna+ever+keep+you+down@gmail.com

Too bad the music emoji is stripped by HN!

PS: Emoji in titles also make things like Youtube videos and blog posts pop quite a bit when shared via social media.


But that's not a feature of email in general. That's a feature of gmail, and a few other email providers that have followed suit. For email in general, foo@bar.com and foo+baz@bar.com are distinct emails that may go to different users.


I wouldn't say "a few other email providers that have followed suit". As far as I know, that's always been a feature of Postfix, and common with Sendmail before that. And qmail offered "-" for the same purpose (as well as providing a facility for filtering/redirecting mail based on -extensions).


The standard (as in most common) setup for djb's qmail was to accept/forward mail for <prefix>@user.example.com to user@example.com, as in: debian-list@e12e.example.com rather than debian-list+e12e@example.com.

While I believe the + was common for postfix, exim and sendmail?

There even was a spam-fighting scheme that used this: it took a key, an optional date, and a sender address, and generated a unique address you could give out that was only valid for a certain time, for a certain set of senders.

If the system received a mail with a from address that didn't match the cryptographically signed to-address (e12e-xjjgff65477fc@example.com), the mail was held back, and the system generated a reply with a signed reply-to address. A sort of manual grey-listing.
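
Roughly how such a scheme could work, as a sketch. This is not the actual tool's format or algorithm: the HMAC-SHA256 construction, the `user-YYYYMMDD-tag` layout, and the names `signed_address` and `accepts` are all assumptions made for illustration.

```python
import hashlib
import hmac

def _tag(user: str, sender: str, expires: str, key: bytes, tag_len: int) -> str:
    """Truncated HMAC over the user, the allowed sender, and the expiry date."""
    msg = f"{user}|{sender.lower()}|{expires}".encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()[:tag_len]

def signed_address(user: str, domain: str, sender: str, expires: str,
                   key: bytes, tag_len: int = 12) -> str:
    """Build an address valid for one sender until `expires` (YYYYMMDD)."""
    return f"{user}-{expires}-{_tag(user, sender, expires, key, tag_len)}@{domain}"

def accepts(to_address: str, sender: str, today: str,
            key: bytes, tag_len: int = 12) -> bool:
    """Check the tag against the actual sender and reject expired addresses."""
    local, _, _domain = to_address.rpartition("@")
    try:
        # Split from the right so a user name containing '-' still works.
        user, expires, tag = local.rsplit("-", 2)
    except ValueError:
        return False
    if today > expires:  # YYYYMMDD strings compare correctly as text
        return False
    expected = _tag(user, sender, expires, key, tag_len)
    return hmac.compare_digest(tag.encode(), expected.encode())
```

Mail that fails `accepts` would be the mail that gets held back and answered with a challenge.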


The point is that foo+bar is a convention that many providers follow and not an actual rule of email, and therefore blocking that doesn't make sense because you'll block legitimate addresses too.

Similarly, mail providers can come up with all sorts of different conventions if they want. For example, when setting up a new domain in FastMail, it offers the ability to accept anything@user.dom.ain and turn that into user+anything@dom.ain, and it offers user@anything.dom.ain which it will deliver to user@dom.ain. So here we already have two new conventions that sites can't possibly detect as alternatives to the normal foo+bar@dom.ain.


That is fair, but maybe the providers just want to stop account spamming on their sites. Can you blame them for that?


Depending on your email provider, there are a ton of different ways to create brand new email addresses. Blocking off the single pattern of foo+bar@example.com just punishes legitimate users without causing any problem for spammers. Besides the fact that foo+bar@example.com may in fact be a legitimate unique email address, there's also just the case of people who routinely use username+sitename@example.com when signing up for anything as a way of tracking where incoming email is coming from.



