Word with two capital letters

Someone in one of my online editing groups wanted to find all the acronyms and initialisms in their document—any word comprising two or more capital (‘cap’) letters (e.g. AB, CDEF, GHIJK, etc.). They wanted a command that would find each one so they could check it (possibly against a glossary), then click Find Next to jump to the next one.

Wildcards to the rescue!

Here’s how:

  1. Press Ctrl+H to open the Find and Replace window.
  2. Click the Find tab (we only want to find these, not replace them with anything else).
  3. Click More to show further options.
  4. Select the Use wildcards checkbox.
  5. In the Find what field, type: <[A-Z]{2,}>
  6. Click Find Next to find the first string of two or more caps.
  7. Keep clicking Find Next to jump to the next string of two or more caps.

How this works:

  • The opening and closing arrow brackets (< and >) specify that you want a single whole word, not parts of a word. Without these, you would find each set of caps (e.g. in the string ABCDEF, you would find ABCDEF, then BCDEF, then CDEF, then DEF, then EF, before moving on to the next set of caps).
  • [A-Z] specifies that you want a range (the [ ] part) of caps that fall somewhere in the alphabet (A-Z). If you only wanted capped words that started with, say, H through to M, then you’d change the range to [H-M] and all other capped words starting with other letters would be ignored.
  • {2,} means you want to find capped words with at least two letters in the specified range (i.e. A-Z). If you only wanted to find two- and three-letter capped words, then you’d change this to {2,3}, and all capped words of four or more letters would be ignored. By not specifying a number after the comma, the ‘find’ will find capped words of any length containing at least two letters.
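
Outside Word, the same search can be approximated with an ordinary regular expression. The Java sketch below assumes that the \b word boundary is a reasonable stand-in for Word's < and > markers (the two are not guaranteed to agree on words containing digits or underscores). The class and method names are made up for illustration and are not part of the original tip.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Word: lists whole words made of two or more capitals.
class CapsFinder {
    // \b approximates Word's < and > whole-word markers (an assumption).
    private static final Pattern ALL_CAPS = Pattern.compile("\\b[A-Z]{2,}\\b");

    // Returns every all-caps word found in the text, joined by commas.
    static String findAllCaps(String text) {
        StringBuilder found = new StringBuilder();
        Matcher m = ALL_CAPS.matcher(text);
        while (m.find()) {
            if (found.length() > 0) {
                found.append(',');
            }
            found.append(m.group());
        }
        return found.toString();
    }

    public static void main(String[] args) {
        // prints WHO,UNICEF,US
        System.out.println(findAllCaps("The WHO and UNICEF met a US Senator in Ohio."));
    }
}
```

Changing the quantifier to {2,3} in the pattern would restrict the matches to two- and three-letter words, mirroring the Word variant described above.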

Camel case is named after the «hump» of its protruding capital letter, similar to the hump of common camels.

Camel case (sometimes stylized as camelCase or CamelCase, also known as camel caps or more formally as medial capitals) is the practice of writing phrases without spaces or punctuation and with capitalized words. The format indicates the first word starting with either case, then the following words having an initial uppercase letter. Common examples include «YouTube», «iPhone» and «eBay». Camel case is often used as a naming convention in computer programming. It is also sometimes used in online usernames such as «JohnSmith», and to make multi-word domain names more legible, for example in promoting «EasyWidgetCompany.com».

The more specific terms Pascal case and upper camel case refer to a joined phrase where the first letter of each word is capitalized, including the initial letter of the first word. Similarly, lower camel case (also known as dromedary case) requires an initial lowercase letter. Some people and organizations, notably Microsoft, use the term camel case only for lower camel case, designating Pascal case for the upper camel case.[1] Some programming styles prefer camel case with the first letter capitalized, others not.[2][1][3] For clarity, this article leaves the definition of camel case ambiguous with respect to capitalization, and uses the more specific terms when necessary.
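
The difference between lower and upper camel case can be made concrete with a small conversion routine. This is only a sketch: it assumes the input is a phrase of space-separated words, and the class name, method name and upperFirst flag are illustrative rather than taken from any cited style guide.

```java
import java.util.Locale;

// Illustrative helper; the names here are assumptions, not from any style guide.
class CamelCaseDemo {
    // Joins space-separated words, lowercasing each and capitalizing its first letter.
    // upperFirst = true gives upper camel (Pascal) case; false gives lower camel case.
    static String toCamelCase(String phrase, boolean upperFirst) {
        StringBuilder out = new StringBuilder();
        for (String word : phrase.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only when the whole phrase is blank
            }
            String lower = word.toLowerCase(Locale.ROOT);
            boolean capitalize = upperFirst || out.length() > 0;
            out.append(capitalize ? Character.toUpperCase(lower.charAt(0)) : lower.charAt(0));
            out.append(lower.substring(1));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toCamelCase("end of file", false)); // prints endOfFile
        System.out.println(toCamelCase("end of file", true));  // prints EndOfFile
    }
}
```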

Camel case is distinct from title case, which capitalizes all words but retains the spaces between them, and from Tall Man lettering, which uses capitals to emphasize the differences between similar-looking product names such as «predniSONE» and «predniSOLONE». Camel case is also distinct from snake case, which uses underscores interspersed with lowercase letters (sometimes with the first letter capitalized). A combination of snake and camel case (identifiers Written_Like_This) is recommended in the Ada 95 style guide.[4]

Variations and synonyms

The practice has various names, including:

  • camelBack (or camel-back) notation[5] or CamelCaps[6]
  • camel case or CamelCase
  • CapitalizedWords or CapWords for upper camel case in Python[7]
  • compoundNames[8]
  • Embedded caps (or embedded capitals)[9]
  • HumpBack (or hump-back) notation[10]
  • InterCaps or intercapping[11] (abbreviation of Internal Capitalization[12])
  • medial capitals, recommended by the Oxford English Dictionary[13]
  • mixedCase for lower camel case in Python[7]
  • PascalCase for upper camel case[14][15][16] (after the Pascal programming language)
  • Smalltalk case
  • WikiWord[17] or WikiCase[18] (especially in older wikis)

The earliest known occurrence of the term «InterCaps» on Usenet is in an April 1990 post to the group alt.folklore.computers by Avi Rappoport.[19] The earliest use of the name «Camel Case» occurs in 1995, in a post by Newton Love.[20] Love has since said, «With the advent of programming languages having these sorts of constructs, the humpiness of the style made me call it HumpyCase at first, before I settled on CamelCase. I had been calling it CamelCase for years. … The citation above was just the first time I had used the name on USENET.»[21]

Traditional use in natural language

In word combinations

The use of medial capitals as a convention in the regular spelling of everyday texts is rare, but is used in some languages as a solution to particular problems which arise when two words or segments are combined.

In Italian, pronouns can be suffixed to verbs, and because the honorific form of second-person pronouns is capitalized, this can produce a sentence like non ho trovato il tempo di risponderLe («I have not found time to answer you» – where Le means «to you»).

In German, the medial capital letter I, called Binnen-I, is sometimes used in a word like StudentInnen («students») to indicate that both Studenten («male students») and Studentinnen («female students») are intended simultaneously. However, mid-word capitalization does not conform to German orthography apart from proper names like McDonald; the previous example could be correctly written using parentheses as Student(inn)en, analogous to «congress(wo)men» in English.[22]

In Irish, camel case is used when an inflectional prefix is attached to a proper noun, for example i nGaillimh («in Galway»), from Gaillimh («Galway»); an tAlbanach («the Scottish person»), from Albanach («Scottish person»); and go hÉirinn («to Ireland»), from Éire («Ireland»). In recent Scottish Gaelic orthography, a hyphen has been inserted: an t-Albannach.

This convention is also used by several written Bantu languages (e.g. isiZulu, «Zulu language») and several indigenous languages of Mexico (e.g. Nahuatl, Totonacan, Mixe–Zoque, and some Oto-Manguean languages).

In Dutch, when capitalizing the digraph ij, both the letter I and the letter J are capitalized, for example in the country name IJsland («Iceland»).

In Chinese pinyin, camel case is sometimes used for place names so that readers can more easily pick out the different parts of the name. For example, places like Beijing (北京), Qinhuangdao (秦皇岛), and Daxing’anling (大兴安岭) can be written as BeiJing, QinHuangDao, and DaXingAnLing respectively, with the number of capital letters equaling the number of Chinese characters. Writing word compounds only by the initial letter of each character is also acceptable in some cases, so Beijing can be written as BJ, Qinhuangdao as QHD, and Daxing’anling as DXAL.

In English, medial capitals are usually only found in Scottish or Irish «Mac-» or «Mc-» names, where for example MacDonald, McDonald, and Macdonald are common spelling variants of the same name, and in Anglo-Norman «Fitz-» names, where for example both FitzGerald and Fitzgerald are found.

In their English style guide The King’s English, first published in 1906, H. W. and F. G. Fowler suggested that medial capitals could be used in triple compound words where hyphens would cause ambiguity—the examples they give are KingMark-like (as against King Mark-like) and Anglo-SouthAmerican (as against Anglo-South American). However, they described the system as «too hopelessly contrary to use at present».[23]

In transliterations

In the scholarly transliteration of languages written in other scripts, medial capitals are used in similar situations. For example, in transliterated Hebrew, ha’Ivri means «the Hebrew person» or «the Jew» and b’Yerushalayim means «in Jerusalem». In Tibetan proper names like rLobsang, the «r» stands for a prefix glyph in the original script that functions as tone marker rather than a normal letter. Another example is tsIurku, a Latin transcription of the Chechen term for the capping stone of the characteristic Medieval defensive towers of Chechnya and Ingushetia; the letter «I» (palochka) is not actually capital, denoting a phoneme distinct from the one transcribed as «i».

In abbreviations

Medial capitals are traditionally used in abbreviations to reflect the capitalization that the words would have when written out in full, for example in the academic titles PhD or BSc. A more recent example is NaNoWriMo, a contraction of National Novel Writing Month and the designation for both the annual event and the nonprofit organization that runs it. In German, the names of statutes are abbreviated using embedded capitals, e.g. StGB for Strafgesetzbuch (Criminal Code), PatG for Patentgesetz (Patent Act), BVerfG for Bundesverfassungsgericht (Federal Constitutional Court), or the very common GmbH, for Gesellschaft mit beschränkter Haftung (private limited company). In this context, there can even be three or more camel case capitals, e.g. in TzBfG for Teilzeit- und Befristungsgesetz (Act on Part-Time and Limited Term Occupations). In French, camel case acronyms such as OuLiPo (1960) were favored for a time as alternatives to initialisms.

Camel case is often used to transliterate initialisms into alphabets where two letters may be required to represent a single character of the original alphabet, e.g., DShK from Cyrillic ДШК.

History of modern technical use

Chemical formulas

The first systematic and widespread use of medial capitals for technical purposes was the notation for chemical formulas invented by the Swedish chemist Jacob Berzelius in 1813. To replace the multitude of naming and symbol conventions used by chemists until that time, he proposed to indicate each chemical element by a symbol of one or two letters, the first one being capitalized. The capitalization allowed formulas like «NaCl» to be written without spaces and still be parsed without ambiguity.[24][25]

Berzelius’ system continues to be used, augmented with three-letter symbols such as «Uue» for unconfirmed or unknown elements and abbreviations for some common substituents (especially in the field of organic chemistry, for instance «Et» for «ethyl-»). This has been further extended to describe the amino acid sequences of proteins and other similar domains.

Early use in trademarks

Since the early 20th century, medial capitals have occasionally been used for corporate names and product trademarks, such as

  • DryIce Corporation (1925) marketed the solid form of carbon dioxide (CO2) as «Dry Ice», thus leading to its common name.[26]
  • CinemaScope and VistaVision, rival widescreen movie formats (1953)
  • ShopKo (1962), retail stores, later renamed Shopko
  • MisteRogers Neighborhood, the TV series also called Mister Rogers’ Neighborhood (1968)[27]
  • ChemGrass (1965), later renamed AstroTurf (1967)
  • ConAgra (1971), formerly Consolidated Mills
  • MasterCraft (1968), a sports boat manufacturer
  • AeroVironment (1971)
  • PolyGram (1972), formerly Grammophon-Philips Group
  • United HealthCare (1977)[28]
  • MasterCard (1979), formerly Master Charge
  • SportsCenter (1979)

Computer programming

In the 1970s and 1980s, medial capitals were adopted as a standard or alternative naming convention for multi-word identifiers in several programming languages. The precise origin of the convention in computer programming has not yet been settled. A 1954 conference proceedings[29] occasionally informally referred to IBM’s Speedcoding system as «SpeedCo». Christopher Strachey’s paper on GPM (1965)[30] shows a program that includes some medial capital identifiers, including «NextCh» and «WriteSymbol».

Multiple-word descriptive identifiers with embedded spaces such as end of file or char table cannot be used in most programming languages because the spaces between the words would be parsed as delimiters between tokens. The alternative of running the words together as in endoffile or chartable is difficult to understand and possibly misleading; for example, chartable is an English word (able to be charted), whereas charTable means a table of chars.

Some early programming languages, notably Lisp (1958) and COBOL (1959), addressed this problem by allowing a hyphen («-») to be used between words of compound identifiers, as in «END-OF-FILE»: Lisp because it worked well with prefix notation (a Lisp parser would not treat a hyphen in the middle of a symbol as a subtraction operator) and COBOL because its operators were individual English words. This convention remains in use in these languages, and is also common in program names entered on a command line, as in Unix.

However, this solution was not adequate for mathematically-oriented languages such as FORTRAN (1955) and ALGOL (1958), which used the hyphen as an infix subtraction operator. FORTRAN ignored blanks altogether, so programmers could use embedded spaces in variable names. However, this feature was not very useful since the early versions of the language restricted identifiers to no more than six characters.

Exacerbating the problem, common punched card character sets of the time were uppercase only and lacked other special characters. It was only in the late 1960s that the widespread adoption of the ASCII character set made both lowercase and the underscore character _ universally available. Some languages, notably C, promptly adopted underscores as word separators, and identifiers such as end_of_file are still prevalent in C programs and libraries (as well as in later languages influenced by C, such as Perl and Python). However, some languages and programmers chose to avoid underscores—among other reasons to prevent confusing them with whitespace—and adopted camel case instead.

Charles Simonyi, who worked at Xerox PARC in the 1970s and later oversaw the creation of Microsoft’s Office suite of applications, invented and taught the use of Hungarian Notation, one version of which uses the lowercase letter(s) at the start of a (capitalized) variable name to denote its type. One account[citation needed] claims that the camel case style first became popular at Xerox PARC around 1978, with the Mesa programming language developed for the Xerox Alto computer. This machine lacked an underscore key (whose place was taken by a left arrow «←»), and the hyphen and space characters were not permitted in identifiers, leaving camel case as the only viable scheme for readable multiword names. The PARC Mesa Language Manual (1979) included a coding standard with specific rules for upper and lower camel case that was strictly followed by the Mesa libraries and the Alto operating system. Niklaus Wirth, the inventor of Pascal, came to appreciate camel case during a sabbatical at PARC and used it in Modula, his next programming language.[31]

The Smalltalk language, which was developed originally on the Alto, also uses camel case instead of underscores. This language became quite popular in the early 1980s, and thus may also have been instrumental in spreading the style outside PARC.

Upper camel case (or «Pascal case») is used in the Wolfram Language of the computer algebra system Mathematica for predefined identifiers. User-defined identifiers should start with a lowercase letter. This avoids conflicts between predefined and user-defined identifiers, both today and in all future versions.

Computer companies and products

Whatever its origins in the computing field, the convention has been used in the names of computer companies and their commercial brands since the late 1970s, a trend that continues to this day:

  • (1977) CompuServe
  • (1978) WordStar
  • (1979) VisiCalc
  • (1982) MicroProse, WordPerfect
  • (1983) NetWare
  • (1984) LaserJet, MacWorks, PostScript
  • (1985) PageMaker
  • (1987) ClarisWorks, HyperCard, PowerPoint
  • (1990) WorldWideWeb (the first web browser), later renamed Nexus

Spread to mainstream usage

In the 1980s and 1990s, after the advent of the personal computer exposed hacker culture to the world, camel case then became fashionable for corporate trade names in non-computer fields as well. Mainstream usage was well established by 1990:

  • (1980) EchoStar
  • (1984) BellSouth
  • (1985) EastEnders
  • (1986) SpaceCamp
  • (1990) HarperCollins, SeaTac
  • (1998) PricewaterhouseCoopers, merger of Price Waterhouse and Coopers

During the dot-com bubble of the late 1990s, the lowercase prefixes «e» (for «electronic») and «i» (for «Internet»,[32] «information», «intelligent», etc.) became quite common, giving rise to names like Apple’s iMac and the eBox software platform.

In 1998, Dave Yost suggested that chemists use medial capitals to aid readability of long chemical names, e.g. write AmidoPhosphoRibosylTransferase instead of amidophosphoribosyltransferase.[33] This usage was not widely adopted.

Camel case is sometimes used for abbreviated names of certain neighborhoods, e.g. New York City neighborhoods SoHo (South of Houston Street) and TriBeCa (Triangle Below Canal Street) and San Francisco’s SoMa (South of Market). Such usages erode quickly, so the neighborhoods are now typically rendered as Soho, Tribeca, and Soma.

Internal capitalization has also been used for other technical codes like HeLa (1983).

Current usage in computing

Programming and coding

The use of medial caps for compound identifiers is recommended by the coding style guidelines of many organizations or software projects. For some languages (such as Mesa, Pascal, Modula, Java and Microsoft’s .NET) this practice is recommended by the language developers or by authoritative manuals and has therefore become part of the language’s «culture».

Style guidelines often distinguish between upper and lower camel case, typically specifying which variety should be used for specific kinds of entities: variables, record fields, methods, procedures, functions, subroutines, types, etc. These rules are sometimes supported by static analysis tools that check source code for adherence.

The original Hungarian notation for programming, for example, specifies that a lowercase abbreviation for the «usage type» (not data type) should prefix all variable names, with the remainder of the name in upper camel case; as such it is a form of lower camel case.

Programming identifiers often need to contain acronyms and initialisms that are already in uppercase, such as «old HTML file». By analogy with the title case rules, the natural camel case rendering would have the abbreviation all in uppercase, namely «oldHTMLFile». However, this approach is problematic when two acronyms occur together (e.g., «parse DBM XML» would become «parseDBMXML») or when the standard mandates lower camel case but the name begins with an abbreviation (e.g. «SQL server» would become «sQLServer»). For this reason, some programmers prefer to treat abbreviations as if they were words and write «oldHtmlFile», «parseDbmXml» or «sqlServer».[34] However, this can make it harder to recognize that a given word is intended as an acronym.[35]
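
The two treatments of abbreviations can be compared side by side. The following sketch is an illustration only: the keepAcronyms flag and the class and method names are hypothetical, and the routine simply assumes space-separated input words and a lower camel case target.

```java
import java.util.Locale;

// Illustrative only: contrasts the two acronym treatments described above.
class AcronymCasing {
    // Builds a lower camel case identifier from a space-separated phrase.
    // keepAcronyms = true leaves the letters after each word's first letter untouched
    // ("HTML" stays "HTML"); false lowercases them ("HTML" becomes "Html").
    static String lowerCamel(String phrase, boolean keepAcronyms) {
        StringBuilder out = new StringBuilder();
        for (String word : phrase.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            String rest = word.substring(1);
            if (!keepAcronyms) {
                rest = rest.toLowerCase(Locale.ROOT);
            }
            char first = word.charAt(0);
            // The first word starts lowercase; every later word starts uppercase.
            out.append(out.length() == 0 ? Character.toLowerCase(first) : Character.toUpperCase(first));
            out.append(rest);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(lowerCamel("parse DBM XML", true));  // prints parseDBMXML
        System.out.println(lowerCamel("parse DBM XML", false)); // prints parseDbmXml
        System.out.println(lowerCamel("SQL server", true));     // prints sQLServer
        System.out.println(lowerCamel("SQL server", false));    // prints sqlServer
    }
}
```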

Difficulties arise when identifiers have different meanings depending only on the case, as can occur with mathematical functions or trademarks. In this situation, changing the case of an identifier might not be an option, and an alternative name needs to be chosen.

Wiki link markup

Camel case is used in some wiki markup languages for terms that should be automatically linked to other wiki pages. This convention was first used in Ward Cunningham’s original wiki software, WikiWikiWeb,[36] and can be activated in most other wikis. Some wiki engines such as TiddlyWiki, Trac and PmWiki make use of it in the default settings, but usually also provide a configuration mechanism or plugin to disable it. Wikipedia formerly used camel case linking as well, but switched to explicit link markup using square brackets,[37] and many other wiki sites have done the same. MediaWiki, for example, does not support camel case for linking. Some wikis that do not use camel case linking, such as AboutUs, may still use camel case as a naming convention.
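
As a rough illustration of automatic linking, the sketch below wraps camel case terms in double square brackets. The matching rule (two or more runs of one capital followed by lowercase letters) is an assumption made for this example; real wiki engines each define their own rule, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; actual wiki engines use their own, differing rules.
class WikiWordDemo {
    // Assumed rule: two or more runs of one capital followed by lowercase letters.
    private static final Pattern WIKI_WORD = Pattern.compile("\\b(?:[A-Z][a-z]+){2,}\\b");

    // Wraps every term that matches the assumed rule in double square brackets.
    static String linkify(String text) {
        return WIKI_WORD.matcher(text).replaceAll("[[$0]]");
    }

    public static void main(String[] args) {
        // prints See [[FrontPage]] and the Wiki sandbox.
        System.out.println(linkify("See FrontPage and the Wiki sandbox."));
    }
}
```

Ordinary capitalized words such as «See» and «Wiki» contain only one capitalized run, so they are left alone under this rule.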

Other uses

The NIEM registry requires that XML data elements use upper camel case and XML attributes use lower camel case.

Most popular command-line interfaces and scripting languages cannot easily handle file names that contain embedded spaces (usually requiring the name to be put in quotes). Therefore, users of those systems often resort to camel case (or underscores, hyphens and other «safe» characters) for compound file names like MyJobResume.pdf.

Microblogging and social networking services that limit the number of characters in a message are potential outlets for medial capitals. Using camel case between words reduces the number of spaces, and thus the number of characters, in a given message, allowing more content to fit into the limited space. Hashtags, especially long ones, often use camel case to maintain readability (e.g. #CollegeStudentProblems is easier to read than #collegestudentproblems);[38] this practice improves accessibility as screen readers recognize CamelCase in parsing composite hashtags.[39]

In website URLs, spaces are percent-encoded as «%20», making the address longer and less human readable. By omitting spaces, camel case does not have this problem.

Readability studies

Camel case has been criticized as negatively impacting readability due to the removal of spaces and uppercasing of every word.[40]

A 2009 study of 135 subjects comparing snake case (underscored identifiers) to camel case found that camel case identifiers were recognized with higher accuracy among all subjects. Subjects recognized snake case identifiers more quickly than camel case identifiers. Training in camel case sped up camel case recognition and slowed snake case recognition, although this effect involved coefficients with high p-values. The study also conducted a subjective survey and found that non-programmers either preferred underscores or had no preference, and 38% of programmers trained in camel case stated a preference for underscores. However, these preferences had no statistical correlation to accuracy or speed when controlling for other variables.[41]

A 2010 follow-up study used a similar study design with 15 subjects consisting of expert programmers trained primarily in snake case. It used a static rather than animated stimulus and found perfect accuracy in both styles except for one incorrect camel case response. Subjects recognized identifiers in snake case more quickly than camel case. The study used eye-tracking equipment and found that the difference in speed for its subjects was primarily due to the fact that average duration of fixations for camel-case was significantly higher than that of snake case for 3-part identifiers. The survey recorded a mixture of preferred identifier styles but again there was no correlation of preferred style to accuracy or speed.[42]

See also

  • All caps
  • Alternating caps
  • Capitalization
  • Caps lock
  • Kebab case
  • Naming convention (programming)
  • Shift key
  • Small caps
  • Snake case
  • Unicase

References

  1. ^ a b «Capitalization Styles — .NET Framework 1.1». Retrieved 5 December 2012.
  2. ^ «Naming Conventions». Scala. Retrieved 5 December 2012.
  3. ^ «Camel Case». Retrieved 10 March 2016.
  4. ^ «Ada 95 Quality and Style Guide». October 1995. Section 3.1.3. Retrieved 25 January 2020.
  5. ^ C# Coding Standards and Guidelines Archived 11 April 2008 at the Wayback Machine at Purdue University College of Technology
  6. ^ «CamelCase@Everything2.com». Everything2.com. Retrieved 4 June 2010.
  7. ^ a b Style Guide for Python Code at www.python.org
  8. ^ Feldman, Ian (29 March 1990). «compoundNames». Newsgroup: alt.folklore.computers. Usenet: 3230@draken.nada.kth.se.
  9. ^ «[#APF-1088] If class name has embedded capitals, AppGen code fails UI tests and generated hyperlinks are incorrect. – AppFuse JIRA». Issues.appfuse.org. Archived from the original on 25 June 2017. Retrieved 4 June 2010.
  10. ^ ASP Naming Conventions Archived 8 April 2009 at the Wayback Machine, by Nannette Thacker (05/01/1999)
  11. ^ Iverson, Cheryl; Christiansen, Stacy; Flanagin, Annette; Fontanarosa, Phil B.; Glass, Richard M.; Gregoline, Brenda; Lurie, Stephen J.; Meyer, Harriet S.; Winker, Margaret A.; Young, Rozanne K., eds. (2007). AMA Manual of Style (10th ed.). Oxford, Oxfordshire: Oxford University Press. ISBN 978-0-19-517633-9.
  12. ^ Hult, Christine A.; Huckin, Thomas N. «The Brief New Century Handbook – Rules for internal capitalization». Pearson Education. Archived from the original on 7 April 2012.
  13. ^ «What is the name for a word containing two capital letters (like WordPad)?». AskOxford. Internet Archive. Archived from the original on 25 October 2008. Retrieved 12 June 2022.
  14. ^ «Brad Abrams: History around Pascal Casing and Camel Casing». Blogs.msdn.com. 3 February 2004. Retrieved 4 January 2014.
  15. ^ «Pascal Case». C2.com. 27 September 2012. Retrieved 4 January 2014.
  16. ^ «NET Framework General Reference Capitalization Styles». MSDN2.microsoft.com. Retrieved 4 January 2014.
  17. ^ «WikiWord». Twiki.org. Retrieved 4 June 2010.
  18. ^ «Wiki Case». C2.com. 8 February 2010. Retrieved 4 June 2010.
  19. ^ Rappoport, Avi (3 April 1990). «compoundNames». Newsgroup: alt.folklore.computers.
  20. ^ Newton Love (12 September 1995). «I’m happy again! – comp.os.os2.advocacy | Google Groups». Groups.google.com. Retrieved 23 May 2009.
  21. ^ Newton Love[dead link]
  22. ^ Richtiges und gutes Deutsch: Das Wörterbuch der sprachlichen Zweifelsfälle. Duden (in German). Vol. 9 (7th ed.). Mannheim: Bibliographisches Institut. 2011. p. 418. ISBN 978-3411040971.
  23. ^ Fowler, Henry W.; Fowler, Francis G. (1908). «Chapter IV. Punctuation – Hyphens». The King’s English (2nd ed.). Oxford. Archived from the original on 31 December 2009. Retrieved 19 December 2009.
  24. ^ Jöns Jacob Berzelius (1813). Essay on the Cause of Chemical Proportions and on Some Circumstances Relating to Them: Together with a Short and Easy Method of Expressing Them. Annals of Philosophy 2, 443-454, 3, 51-52; (1814) 93-106, 244-255, 353-364.
  25. ^ Henry M. Leicester & Herbert S. Klickstein, eds. 1952, A Source Book in Chemistry, 1400-1900 (Cambridge, MA: Harvard)
  26. ^ The Trade-mark Reporter. United States Trademark Association. 1930. ISBN 1-59888-091-8.
  27. ^ «Mister Rogers Neighborhood Season 1 (Episode 4)». Retrieved 21 June 2022.
  28. ^ «Our History». unitedhealthgroup.com. Retrieved 15 May 2019.[permanent dead link]
  29. ^ «Resume of Session 8». Digital Computers: Advanced Coding Techniques. Summer Session 1954, Massachusetts Institute of Technology (PDF). 1954. pp. 8–6. Archived from the original (PDF) on 29 February 2012. Retrieved 4 January 2014.
  30. ^ Strachey, Christopher (October 1965). «A General Purpose Macrogenerator». Computer Journal. 8 (3): 225–241. doi:10.1093/comjnl/8.3.225.
  31. ^ Niklaus Wirth (2007). «Modula-2 and Oberon». Proc. 3rd Conf. History of Programming Languages. Hopl III. San Diego: 3-1–3-10. CiteSeerX 10.1.1.91.1447. doi:10.1145/1238844.1238847. ISBN 9781595937667. S2CID 1918928.
  32. ^ Farhad Manjoo (30 April 2002). «Grads Want to Study on EMacs, Too». Wired.com. Retrieved 4 June 2010.
  33. ^ Feedback, 20 June 1998 Vol 158 No 2139 New Scientist 20 June 1998
  34. ^ «Google Java Style Guide». google.github.io. Retrieved 2 November 2022.
  35. ^ Dave Binkley; Marcia Davis; Dawn Lawrie; Christopher Morrell (2009). «To CamelCase or Under_score». IEEE 17th International Conference on Program Comprehension, 2009. ICPC ’09. IEEE: 158–167. CiteSeerX 10.1.1.158.9499. In terms of camel-cased identifiers, this has a greater impact on identifiers that include short words and especially acronyms. For example, consider the acronym ID found in the identifier kIOuterIIDPath. Because of the run of uppercase letters, the task of reading kIOuterIIDPath, in particular the identification of the word ID, is more difficult.
  36. ^ Andrew Lih, The Wikipedia Revolution: How a Bunch of Nobodies Created the World’s Greatest Encyclopedia (New York: Hyperion, 2009), pp. 57–58.
  37. ^ Lih, The Wikipedia Revolution, pp. 62–63, 67.
  38. ^ Blackwood, Jessica; Brown, Kate. «Accessible Use of CamelCase and Structuring Posts».
  39. ^ «Social Media Accessibility Guidelines».
  40. ^ Caleb Crain (23 November 2009). «Against Camel Case». New York Times.
  41. ^ Dave Binkley; Marcia Davis; Dawn Lawrie; Christopher Morrell (2009). «To CamelCase or Under_score». IEEE 17th International Conference on Program Comprehension, 2009. ICPC ’09. IEEE: 158–167. CiteSeerX 10.1.1.158.9499. The experiment builds on past work of others who study how readers of natural language perform such tasks. Results indicate that camel casing leads to higher accuracy among all subjects regardless of training, and those trained in camel casing are able to recognize identifiers in the camel case style faster than identifiers in the underscore style.
  42. ^ Bonita Sharif; Jonathan I. Maletic (2010). «An Eye Tracking Study on camelCase and under_score Identifier Styles». IEEE 18th International Conference on Program Comprehension, 2010. ICPC ’10. IEEE: 196–205. CiteSeerX 10.1.1.421.6137. doi:10.1109/ICPC.2010.41. ISBN 978-1-4244-7604-6. S2CID 14170019. An empirical study to determine if identifier-naming conventions (i.e., camelCase and under_score) affect code comprehension is presented. An eye tracker is used to capture quantitative data from human subjects during an experiment. The intent of this study is to replicate a previous study published at ICPC 2009 (Binkley et al.) that used a timed response test method to acquire data. The use of eye-tracking equipment gives additional insight and overcomes some limitations of traditional data gathering techniques. Similarities and differences between the two studies are discussed. One main difference is that subjects were trained mainly in the underscore style and were all programmers. While results indicate no difference in accuracy between the two styles, subjects recognize identifiers in the underscore style more quickly.

External links

  • Examples and history of CamelCase, also WordsSmashedTogetherLikeSo
  • .NET Framework General Reference Capitalization Styles
  • What’s in a nAME(cq)?, by Bill Walsh, at The Slot
  • The Science of Word Recognition, by Kevin Larson, Advanced Reading Technology, Microsoft Corporation
  • Convert text to CamelCase
  • OASIS Cover Pages: CamelCase for Naming XML-Related Components
  • Convert text to CamelCase, Title Case, Uppercase and lowercase
  • Demystifying Common Casings in Programming: What They Are and When to Use Them

I want to replace any word which contains two capital letters.

Here are my strings:

jennie-garth-jennie-garth-inner-city-arts-gala-october-17-2012-If9aSpTW
jennie-garth-jennie-garth3892-H9rDcbY 

i want to replace -If9aSpTW with —

The -If9aSpTW part varies, so I can’t use str_replace. The only way I can identify these words is that they contain at least two capital letters. They always come at the end, but they appear in only about 20% of the database values, so I can’t simply strip the last word from every value.

asked Sep 29, 2016 at 6:01 by Steeve

str_replace is not context-aware, and substr won’t work either, since you need to check for two uppercase letters in the last hyphen-delimited chunk of the text. So you really need a regex-based replacement with preg_replace.

You may use the following regex:

preg_replace('/-(?:[^-]*[A-Z]){2,}[^-]*$/', '', $str);

See the regex demo.

The pattern matches:

  • -: a hyphen
  • (?:[^-]*[A-Z]){2,}: 2 or more occurrences (due to the {2,} limiting quantifier) of a sequence of:
    • [^-]*: zero or more chars other than -
    • [A-Z]: an uppercase ASCII letter
  • [^-]*: zero or more chars other than -
  • $: end of string

PHP:

$str = 'jennie-garth-jennie-garth-inner-city-arts-gala-october-17-2012-If9aSpTWe';
echo preg_replace('/-(?:[^-]*[A-Z]){2,}[^-]*$/', '', $str);
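
For comparison with the Java material further down, the same pattern can be sketched in Java. This is only an illustrative port, not part of the original answer: the class name TrailingChunk and the method strip are hypothetical, and the sample inputs are the two strings from the question.

```java
import java.util.regex.Pattern;

// Hypothetical helper (names are illustrative, not from the original answer).
class TrailingChunk {
    // Same pattern as the PHP answer: a hyphen, then a final chunk
    // that contains two or more ASCII uppercase letters.
    private static final Pattern TAIL =
            Pattern.compile("-(?:[^-]*[A-Z]){2,}[^-]*$");

    // Removes the trailing chunk if it matches; otherwise returns the input unchanged.
    static String strip(String s) {
        return TAIL.matcher(s).replaceFirst("");
    }

    public static void main(String[] args) {
        System.out.println(strip("jennie-garth-jennie-garth-inner-city-arts-gala-october-17-2012-If9aSpTW"));
        System.out.println(strip("jennie-garth-jennie-garth3892-H9rDcbY"));
        // No final chunk with two capitals, so nothing is removed.
        System.out.println(strip("jennie-garth-jennie-garth"));
    }
}
```

Values whose last chunk has fewer than two capitals pass through untouched, which is the behaviour the question asks for when only some of the database values carry the suffix.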

answered Sep 29, 2016 at 6:13 by Wiktor Stribiżew

Question: Words with two or more capital letters in Java

2023-04-12T00:36:43+00:00

I need to match words with at least 2 capital letters, with any special characters (like @#$%^&*()_-+= and so on…) optional.

I tried:

    public static boolean isWordHas2Caps(String s) {
        // Backslashes must be doubled inside a Java string literal.
        return s.matches("\\b(?:\\p{Ll}*\\p{Lu}){2,}\\p{Ll}*\\b");
    }

But these calls do not all give the results I expect:

    System.out.println(isWordHas2Caps("eHJHJK"));
    System.out.println(isWordHas2Caps("YUIYUI"));
    System.out.println(isWordHas2Caps("LkfjkdJkdfj"));
    System.out.println(isWordHas2Caps("LLdkjkd"));
    System.out.println(isWordHas2Caps("OhdfjhdsjO"));
    System.out.println(isWordHas2Caps("LLLuoiu9898"));
    System.out.println(isWordHas2Caps("Ohdf&jh/dsjO"));
    System.out.println(isWordHas2Caps("auuuu"));
    System.out.println(isWordHas2Caps("JJJJJJJJ"));
    System.out.println(isWordHas2Caps("YYYY99999"));
    System.out.println(isWordHas2Caps("ooooPPPP"));

Output:

true   eHJHJK
true   YUIYUI
true   LkfjkdJkdfj
true   LLdkjkd
true   OhdfjhdsjO
false   LLLuoiu9898       It should be true but getting false
false   Ohdf&jh/dsjO      It should be true but getting false
false   auuuu
true   JJJJJJJJ
false   YYYY99999        It should be true but getting false
true   ooooPPPP

I think I should allow numbers and special characters in the regexp. How can I do that?

Total answers: 2

Answer 1: Words with two or more capital letters in Java

Update:

A valuable comment from anubhava:

Probably s.matches("(?:\\S*\\p{Lu}){2}\\S*"); may be better

Demo of the above solution.
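
The pattern from that comment can be tried out with a small sketch like the one below. The class name NonSpaceCaps and the "Hello World" input are assumptions added for illustration. Because \S excludes whitespace, this variant treats the input as a single word and rejects strings containing a space.

```java
// Hypothetical demo class; only the regex comes from the comment above.
class NonSpaceCaps {
    // Two uppercase letters, each preceded by any run of non-whitespace chars,
    // followed by more non-whitespace chars. matches() requires the whole string to fit.
    static boolean hasTwoCaps(String s) {
        return s.matches("(?:\\S*\\p{Lu}){2}\\S*");
    }

    public static void main(String[] args) {
        System.out.println(hasTwoCaps("LLLuoiu9898"));   // true
        System.out.println(hasTwoCaps("Ohdf&jh/dsjO"));  // true
        System.out.println(hasTwoCaps("auuuu"));         // false
        // The space cannot be matched by \S, so a two-word string is rejected.
        System.out.println(hasTwoCaps("Hello World"));   // false
    }
}
```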

Original answer:

You can use the regex \b.*\p{Lu}.*\p{Lu}.*\b, as shown below (backslashes are doubled inside the Java string literal):

public static boolean isWordHas2Caps(String s) {
    return s.matches("\\b.*\\p{Lu}.*\\p{Lu}.*\\b");
}

Demo:

public class Main {
    public static void main(String[] args) {
        System.out.println(isWordHas2Caps("eHJHJK"));
        System.out.println(isWordHas2Caps("YUIYUI"));
        System.out.println(isWordHas2Caps("LkfjkdJkdfj"));
        System.out.println(isWordHas2Caps("LLdkjkd"));
        System.out.println(isWordHas2Caps("OhdfjhdsjO"));
        System.out.println(isWordHas2Caps("LLLuoiu9898"));
        System.out.println(isWordHas2Caps("Ohdf&jh/dsjO"));
        System.out.println(isWordHas2Caps("auuuu"));
        System.out.println(isWordHas2Caps("JJJJJJJJ"));
        System.out.println(isWordHas2Caps("YYYY99999"));
        System.out.println(isWordHas2Caps("ooooPPPP"));
    }

    public static boolean isWordHas2Caps(String s) {
        return s.matches("\\b.*\\p{Lu}.*\\p{Lu}.*\\b");
    }
}

Output:

true
true
true
true
true
true
true
false
true
true
true

2023-04-12T00:36:43+00:00

mRahman

Answer 2: Words with two or more capital letters in Java

You want to check if there are at least two uppercase letters anywhere in a string that can contain arbitrary chars.

Then, you can use

// Requires: import java.util.regex.Pattern;
public static boolean isWordHas2Caps(String s) {
    return Pattern.compile("\\p{Lu}\\P{Lu}*\\p{Lu}").matcher(s).find();
}

See the Java demo.

Alternatively, if you still want to use String#matches, you can use the following (keeping in mind that we need to match the entire string):

public static boolean isWordHas2Caps(String s) {
    return s.matches("(?s)(?:\\P{Lu}*\\p{Lu}){2}.*");
}

The (?s)(?:\P{Lu}*\p{Lu}){2}.* regex matches:

  • (?s): the Pattern.DOTALL embedded flag option (makes . match any char, including line breaks)
  • (?:\P{Lu}*\p{Lu}){2}: two occurrences of zero or more chars other than uppercase letters followed by an uppercase letter
  • .*: the rest of the string.
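
The effect of the (?s) flag only shows up when a line break follows the second uppercase letter. The sketch below illustrates this. The class name DotallDemo and the multi-line input are assumptions made for the example.

```java
// Hypothetical demo class comparing the pattern with and without (?s).
class DotallDemo {
    static boolean withDotall(String s) {
        return s.matches("(?s)(?:\\P{Lu}*\\p{Lu}){2}.*");
    }

    static boolean withoutDotall(String s) {
        return s.matches("(?:\\P{Lu}*\\p{Lu}){2}.*");
    }

    public static void main(String[] args) {
        // Made-up input: two capitals, then a line break, then more text.
        String multiLine = "AB\nrest";
        System.out.println(withDotall(multiLine));    // true: .* crosses the line break
        System.out.println(withoutDotall(multiLine)); // false: .* stops before \n
    }
}
```

A line break that comes before the second capital is not a problem either way, because \P{Lu} is a negated class and matches line terminators regardless of the flag.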

Your code did not return the expected results for those three inputs because all of them contain non-letter characters, while String#matches() requires a full-string match against the pattern, and your pattern matches only strings that consist of letters.

That is why you should either:

  • Match anywhere inside the string, which Matcher.find does best: the \p{Lu}\P{Lu}*\p{Lu} pattern finds an uppercase letter, then zero or more chars that are not uppercase letters, then another uppercase letter
  • Or use the (?s)(?:\P{Lu}*\p{Lu}){2}.* regex with String#matches to match a full string that contains at least two uppercase letters.
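
The difference between the two approaches can be checked side by side with a sketch like this one. The class name TwoCapsCheck and the method names lettersOnly and anywhere are hypothetical. The inputs are taken from the question.

```java
import java.util.regex.Pattern;

// Hypothetical comparison of the asker's pattern with the find()-based check.
class TwoCapsCheck {
    private static final Pattern TWO_CAPS = Pattern.compile("\\p{Lu}\\P{Lu}*\\p{Lu}");

    // The asker's pattern: the whole string must consist of letters only.
    static boolean lettersOnly(String s) {
        return s.matches("\\b(?:\\p{Ll}*\\p{Lu}){2,}\\p{Ll}*\\b");
    }

    // find() succeeds as soon as two uppercase letters occur anywhere in the string.
    static boolean anywhere(String s) {
        return TWO_CAPS.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] inputs = {"eHJHJK", "LLLuoiu9898", "Ohdf&jh/dsjO", "auuuu", "YYYY99999"};
        for (String s : inputs) {
            System.out.println(s + " lettersOnly=" + lettersOnly(s) + " anywhere=" + anywhere(s));
        }
    }
}
```

The two methods agree on letter-only inputs and differ exactly on the inputs that contain digits or symbols.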

2023-04-12T00:36:43+00:00

rohim

