someone told me anything that's a unicode letter is valid in python identifiers

y'all there are *so many* unicode letters

"ː3" (using the triangular colon from IPA) is a valid identifier in python

so is "fuckǃ" (using the 'retroflex click' symbol)
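For anyone who wants to check claims like these, here is a minimal sketch using only the standard library. The `describe` helper and the sample names are illustrative, not from the thread; the two interesting characters are written as escapes (U+02D0 and U+01C3) so they can't be mistaken for ASCII `:` and `!`. Python also NFKC-normalizes identifiers, which is why the lookup key is normalized at the end.

```python
import unicodedata

def describe(name):
    """Return (is_identifier, [(codepoint, category), ...]) for a candidate name."""
    chars = [(f"U+{ord(ch):04X}", unicodedata.category(ch)) for ch in name]
    return name.isidentifier(), chars

TRIANGULAR_COLON_3 = "\u02d03"   # MODIFIER LETTER TRIANGULAR COLON, then "3"
CLICK_NAME = "wow\u01c3"         # "wow" + LATIN LETTER RETROFLEX CLICK

# The last two are controls: ASCII "!" is punctuation, and a digit can't start a name.
for candidate in [TRIANGULAR_COLON_3, CLICK_NAME, "wow!", "3\u02d0"]:
    print(repr(candidate), *describe(candidate))

# Actually bind the odd-looking name to prove the parser accepts it.
namespace = {}
exec(TRIANGULAR_COLON_3 + " = 'smile'", namespace)
key = unicodedata.normalize("NFKC", TRIANGULAR_COLON_3)
print(namespace[key])
```

Both characters sit in letter categories (`Lm` and `Lo`), which is what makes them acceptable where their ASCII lookalikes are not.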

@ontploffing weirdly, in my limited experience, it depends on the writing system the rtl character comes from. i'll describe my findings in better detail tomorrow

@typhlosion I think identifiers in programming languages should be limited to a subset of ASCII. Any use of non-ASCII Unicode characters outside comments and strings should be forbidden, because it leads to errors: many, many Unicode characters look alike while having different code points.

A nasty prank to play on a programmer you don't like? Replacing a semicolon with a Greek question mark.
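A rough sketch of why that swap is so hard to see, and of what Python does with it. The helper name `compare` is made up for illustration. The two characters are distinct code points, yet Unicode treats the Greek question mark as canonically equivalent to the semicolon, so NFC normalization folds one into the other. Python's tokenizer does not normalize punctuation, so the lookalike is rejected rather than silently accepted.

```python
import unicodedata

SEMICOLON = ";"
GREEK_QUESTION_MARK = "\u037e"

def compare(a, b):
    """Summarize how two single characters relate to each other."""
    return {
        "same_codepoint": a == b,
        "names": (unicodedata.name(a), unicodedata.name(b)),
        "same_after_nfc": unicodedata.normalize("NFC", a)
        == unicodedata.normalize("NFC", b),
    }

print(compare(SEMICOLON, GREEK_QUESTION_MARK))

def compiles(source):
    """True if Python accepts the source text, False on a SyntaxError."""
    try:
        compile(source, "<prank>", "exec")
        return True
    except SyntaxError:
        return False

print(compiles("a = 1; b = 2"))        # real semicolon
print(compiles("a = 1\u037e b = 2"))   # Greek question mark in its place
```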

@Feufochmar i'm glad programming language designers generally aren't sticks in the mud like you are

@typhlosion A cool feature is not necessarily a nice feature from a maintenance point of view. Generalized Unicode identifiers are cool, not nice, especially when you do not work alone on your code.

Code should be easy to read, understand, and modify. There is too much potential abuse with unrestricted Unicode.

A nicer feature would be to let the programmer declare which sets of letters are used for identifiers in a source file, so that you are not limited to Latin.
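One way such a per-file declaration could look, sketched as an external check rather than a language feature. Everything here is an assumption made for illustration: the `# identifier-scripts:` comment syntax, the function names, and the Latin-only default. Python's standard library has no script property, so the script is approximated by the first word of each character's Unicode name, which is only a heuristic.

```python
import io
import re
import tokenize
import unicodedata

# Hypothetical magic comment, e.g. "# identifier-scripts: latin, greek"
DECLARATION = re.compile(r"#\s*identifier-scripts:\s*(.+)")

def declared_scripts(source):
    """Scripts the file says it uses; assume Latin only if nothing is declared."""
    for line in source.splitlines():
        match = DECLARATION.match(line.strip())
        if match:
            return {part.strip().upper() for part in match.group(1).split(",")}
    return {"LATIN"}

def undeclared_identifiers(source):
    """Return (line, name) for each name that uses a script the file did not declare."""
    allowed = declared_scripts(source)
    bad = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.NAME:
            continue
        # Heuristic: first word of the Unicode name stands in for the script.
        used = {unicodedata.name(ch, "UNKNOWN").split()[0]
                for ch in tok.string if ch.isalpha()}
        if not used <= allowed:
            bad.append((tok.start[0], tok.string))
    return bad

sample = "# identifier-scripts: latin, greek\n\u03c0 = 3.14\n\u0438\u043c\u044f = 'x'\n"
print(undeclared_identifiers(sample))
```

In the sample, the Greek `π` passes because Greek is declared, while the Cyrillic name on line 3 is reported.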

@Feufochmar there's potential for abuse with just ascii too. i could give all my identifiers max-length base 64 uuids and make similarly awful code. either way, the responsibility for making the code legible and sensible is up to the coder. that responsibility doesn't go away just because they have more characters to work with

@typhlosion Base64 UUID abuse is easier to spot than mixed Latin/Cyrillic/Greek identifiers.

@Feufochmar so? the point remains: it's up to the programmer to employ good coding practices, and it's up to the people around the coder to take action about bad ones, irrespective of the programming language in question. either way, the person who goes out of their way to make bad code is not someone you want on your project. in sum, i don't think this is something the language itself should have an opinion on.

@Feufochmar (if you really want to make sure stuff like that is kept out of your projects, run pull requests and commits through a linter rather than asking to rob an entire programming language of potentially useful functionality)
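A minimal sketch of what such a lint check might look like, assuming the rule is "no single identifier may mix writing systems". The function names are invented for this example and this is not an existing linter's API. The script is guessed from the first word of each character's Unicode name, a heuristic that works for Latin, Cyrillic, and Greek but is not a real script lookup.

```python
import io
import tokenize
import unicodedata

def scripts_of(identifier):
    """Approximate the set of scripts used by the letters in an identifier."""
    return {unicodedata.name(ch, "UNKNOWN").split()[0]
            for ch in identifier if ch.isalpha()}

def mixed_script_identifiers(source):
    """Return (line, name, scripts) for every name token that mixes scripts."""
    findings = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            scripts = scripts_of(tok.string)
            if len(scripts) > 1:
                findings.append((tok.start[0], tok.string, sorted(scripts)))
    return findings

# The second "total" hides a Cyrillic "о" (U+043E) in place of the Latin "o".
sample = "total = 1\nt\u043etal = 2\nprint(total, t\u043etal)\n"
for line, name, scripts in mixed_script_identifiers(sample):
    print(f"line {line}: {name!r} mixes {scripts}")
```

A check like this could run in a pre-commit hook or CI job, which keeps the policy in the project rather than in the language.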

@typhlosion
One of the few legitimate uses of this functionality I see is allowing code to be written in languages other than English. The feature I proposed allows that too, but forces the programmer to declare which Unicode blocks they want to use.

However, most coding rules say to write code in English, notably when working in teams, or companies.

The linter should be part of the compiler, or provided with the language tools, since most of the time one is needed anyway to check coding rules.
