Barry Margolin
Apr 24, 1992, 1:32:24 AM
Terminology question: are the Unix units of file storage called
«filesystems» (one word) or «file systems» (two words)? Neither is in
TNHD.
—
Barry Margolin
System Manager, Thinking Machines Corp.
bar…@think.com {uunet,harvard}!think!barmar
Kartik Subbarao
Apr 24, 1992, 3:51:15 AM
In article <kveb78…@early-bird.think.com> bar…@think.com (Barry Margolin) writes:
>
>Terminology question: are the Unix units of file storage called
>«filesystems» (one word) or «file systems» (two words)? Neither is in
>TNHD.
Who cares? Do you «login» or do you «log in»? Does it matter?
-Kartik
The Jester
Apr 24, 1992, 8:40:09 AM
subb…@phoenix.Princeton.EDU (Kartik Subbarao) writes:
Or logon or log on for that matter
> -Kartik
—
/ / / / / | Unexpected program termination has one saving grace —
\/ / / / /_ | you can usually guess how much data may have been lost.
/ / / / | —Jennifer Bonnitcha (Australian «journalist»)
g880…@cs.uow.edu.au
Jay Ashworth
Apr 24, 1992, 10:09:00 PM
Barry Margolin, to All on Friday April 24 1992:
BM> Terminology question: are the Unix units of file storage called
BM> «filesystems» (one word) or «file systems» (two words)? Neither is in
BM> TNHD.
To the best of my knowledge, Barry, from 9 years at this…
The jargon term for what you boot from, or mount, on a unix system is, from the preponderance of the references I’ve seen, «filesystem» (no space).
Cheers,
— jra
—————————————————————————
Jay R. Ashworth jra%ac…@tct.com
Ashworth & Associates Jay_Ashworth@{psycho.fidonet.org,
An Interdisciplinary Consultancy f160.n3603.z1.fidonet.org}
in Advanced Technology +1_813_449_UNIX@Long_Lines.com
—
Internet: Jay.As…@f160.n3603.z1.FIDONET.ORG
UUCP: …!uunet!myrddin!tct!psycho!160!Jay.Ashworth
Note:psycho is a free gateway between Usenet & Fidonet. For info write to
ro…@psycho.fidonet.org.
Charlie Gibbs
Apr 25, 1992, 12:13:12 AM
In article <1992Apr24.0…@cs.uow.edu.au> g880…@cs.uow.edu.au
(The Jester) writes:
>>>Terminology question: are the Unix units of file storage called
>>>«filesystems» (one word) or «file systems» (two words)? Neither is in
>>>TNHD.
>
>>Who cares? Do you «login» or do you «log in»? Does it matter?
>Or logon or log on for that matter
My rule of thumb is to write it as one word if it’s being used
as a noun, and two words if a verb (phrase). Thus I «log in», while
the act of doing so is a «login».
Over the past few years I’ve noticed that people are running
words together more and more. I find this silly, irritating, or
misleading, depending on the state of my liver. Maybe it’s due to
influence from the German language, where it is standard to string
words together to produce some truly incredible compound words.
Charli…@mindlink.bc.ca
I’m trying to develop a photographic memory.
Jay Ashworth
Apr 25, 1992, 10:15:00 AM
Kartik Subbarao, to All on Friday April 24 1992:
KS> In article <kveb78…@early-bird.think.com> bar…@think.com (Barry
KS> Margolin) writes:
>> Terminology question: are the Unix units of file storage called
>> «filesystems» (one word) or «file systems» (two words)? Neither is in
>> TNHD.
KS> Who cares? Do you «login» or do you «log in»? Does it matter?
Yes, Dammit!
GRAMMAR_SOAPBOX=ON; export GRAMMAR_SOAPBOX
‘login’ is a noun, used to refer to a prompt, or an occurrence of a getty on a
terminal/port.
‘log in’ is a verb phrase, used to describe what you do when confronted by a
‘login’.
They are _not_ interchangeable.
Since the other two are, to some extent at least, the analogy doesn’t hold.
GRAMMAR_SOAPBOX=OFF
Doug McNaught
Apr 25, 1992, 10:44:46 AM
Terminology question: are the Unix units of file storage called
«filesystems» (one word) or «file systems» (two words)? Neither is in
TNHD.
My $0.02: filesystem. In my brain at least, it’s one concept ==> one word.
—
Barry Margolin
System Manager, Thinking Machines Corp.
bar…@think.com {uunet,harvard}!think!barmar
regards,
doug
—
<><><><><><><><><><><><><><><>Go Orioles<><><><><><><><><><><><><><><><>
<> Doug McNaught do…@cns.caltech.edu <>
<> Help!!! I’m addicted to *Spaceward Ho!* Is there a support group? <>
<><><><><><><><><><><><><><><>Go Orioles<><><><><><><><><><><><><><><><>
Lupe Christoph
Apr 25, 1992, 11:10:09 AM
Charli…@mindlink.bc.ca (Charlie Gibbs) writes:
> Over the past few years I’ve noticed that people are running
>words together more and more. I find this silly, irritating, or
>misleading, depending on the state of my liver. Maybe it’s due to
>influence from the German language, where it is standard to string
>words together to produce some truly incredible compound words.
Donaudampfschiffahrtsgesellschaftskapitaen. This is the standard
«longest word ever used» in German. I don’t believe it was really
used, but who knows? It’s old, and it’s from Austria…
You want to know what it *means*? «Captain at the Danube Steam
Boat Corporation».
—
| …!unido!ukw!lupe (German EUNet, «bang») | Disclaimer: |
| lu…@ukw.UUCP (German EUNet, domain) | As I am self-employed, |
| suninfo!alanya!lupe (Sun Germany) | this *is* the opinion |
| Res non sunt complicanda praeter necessitatem. | of my employer. |
Arlie Davis
Apr 25, 1992, 9:38:51 PM
> >>In article <kveb78…@early-bird.think.com> bar…@think.com (Barry
> >Margolin) writes:
> >>>Terminology question: are the Unix units of file storage called
> >>>«filesystems» (one word) or «file systems» (two words)? Neither is in
> >>>TNHD.
> My rule of thumb is to write it as one word if it’s being used
> as a noun, and two words if a verb (phrase). Thus I «log in», while
> the act of doing so is a «login».
So, to answer Barry’s question, «filesystem», since we file system something,
right?
—
______________
u.signature…
‘
Scott Bronson
Apr 26, 1992, 8:41:04 AM
Yes. At least in the common vocabulary around here:
my login is scott (@mcl.mcl.ucsb.edu)
my log in today was around 2:45 and will end around 10:00 (hopefully).
In other words, you log in with your login and password.
— Scott
Steve Davis
Apr 26, 1992, 9:39:31 AM
Jay.As…@f160.n3603.z1.FIDONET.ORG (Jay Ashworth) writes:
>Barry Margolin, to All on Friday April 24 1992:
> BM> Terminology question: are the Unix units of file storage called
> BM> «filesystems» (one word) or «file systems» (two words)? Neither is in
> BM> TNHD.
>To the best of my knowledge, Barry, from 9 years at this…
>The jargon term for what you boot from, or mount, on a unix system is, from the preponderance of the references I’ve seen, is «filesystem» (no space).
Eh. Excuse me to butt in here with a little bit of >EVIDENCE<.
All you people have to do is RTFM. Look at this:
%man -k (2) | grep file | grep system
getdents (2) — gets directory entries in a filesystem independent format
getdirentries (2) — gets directory entries in a filesystem independent format
statfs, fstatfs (2) — get file system statistics
ustat (2) — get file system statistics
%
Try this with ‘(8)’ for even more confusement. In short, use
whatever feels better to your fingers while you type it.
Stratocaster
—
Steve Davis | Contact me at … | The Boarding House BBS!
| Internet: st…@cis.ksu.edu | 9600 baud (v.32/v.42)
| FidoNet: Steve @ 1:295/3 | America: 913-827-0744
********** Have you hugged your Amiga today? **********
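Editorial sketch, not part of the original post: the pipeline quoted above keeps any summary line that contains both "file" and "system" as substrings, which is why both spellings survive the filter. A minimal Python illustration of that behaviour, assuming the four summaries quoted in the post as sample data (the helper name `grep_all` is made up for this example):

```python
def grep_all(lines, *needles):
    """Keep lines containing every needle as a substring, like chained greps."""
    return [line for line in lines if all(n in line for n in needles)]


# Sample data: the four section-2 summaries quoted in the post above.
SUMMARIES = [
    "getdents (2) - gets directory entries in a filesystem independent format",
    "getdirentries (2) - gets directory entries in a filesystem independent format",
    "statfs, fstatfs (2) - get file system statistics",
    "ustat (2) - get file system statistics",
]

matches = grep_all(SUMMARIES, "file", "system")
one_word = [m for m in matches if "filesystem" in m]
two_words = [m for m in matches if "file system" in m]

# All four lines pass the filter; they split evenly between the spellings.
print(len(matches), len(one_word), len(two_words))
```

On this sample the split is two against two, which matches the post's point that the manual pages themselves do not settle the question.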
Jeffrey T. Hutzelman
Apr 26, 1992, 10:04:38 AM
Hmmm… It seems that the consensus is that «log in» is a verb phrase
and «login» is a noun. However, we don’t seem to agree on exactly what
«login» MEANS…
— Jeffrey Hutzelman
jh…@andrew.cmu.edu, jh…@drycas.BITNET, or JeffreyH11 on America Online
Mr. John T Jensen
Apr 27, 1992, 1:11:55 AM
lu…@ukw.uucp (Lupe Christoph) writes:
>Charli…@mindlink.bc.ca (Charlie Gibbs) writes:
>Donaudampfschiffahrtsgesellschaftskapitaen. This is the standard
>«longest word ever used» in German. I don’t believe it was really
>used, but who knows? It’s old, and it’s from Austria…
>You want to know what it *means*? «Captain at the Danube Steam
>Boat Corporation».
>—
Should that be ‘Captain at the Danube Steam Boat Transportation Corporation’?
^^^^^^^^^^^^^^
Doesn’t -fahrts- mean ‘travel’ or ‘transportation’ or something like that?
jj
John Thayer Jensen 64 9 373 7599 ext. 7543
Commerce Computer Services 64 9 373 7437 (FAX)
Auckland University jt.j…@aukuni.ac.nz
Private Bag 92019
AUCKLAND
New Zealand
Lupe Christoph
Apr 27, 1992, 11:33:13 AM
st…@matt.ksu.ksu.edu (Steve Davis) writes:
>Jay.As…@f160.n3603.z1.FIDONET.ORG (Jay Ashworth) writes:
>>Barry Margolin, to All on Friday April 24 1992:
>> BM> Terminology question: are the Unix units of file storage called
>> BM> «filesystems» (one word) or «file systems» (two words)? Neither is in
>> BM> TNHD.
>>To the best of my knowledge, Barry, from 9 years at this…
>>The jargon term for what you boot from, or mount, on a unix system is, from the preponderance of the references I’ve seen, is «filesystem» (no space).
>Eh. Excuse me to butt in here with a little bit of >EVIDENCE<.
>All you people have to do is RTFM. Look at this:
>%man -k (2) | grep file | grep system
>getdents (2) — gets directory entries in a filesystem independent format
>getdirentries (2) — gets directory entries in a filesystem independent format
>statfs, fstatfs (2) — get file system statistics
>ustat (2) — get file system statistics
>%
>Try this with ‘(8)’ for even more confusement. In short, use
>whatever feels better to your fingers while you type it.
I just did a bit of research (this is SunOS 4.1.2):
grep ‘file system’ /usr/man/man*/* > ‘file system’
grep ‘filesystem’ /usr/man/man*/* > ‘filesystem’
wc -l file*
909 file system
207 filesystem
1116 total
This is 81% for «file system», 19% for «filesystem».
What does *your* vendor say?
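Editorial sketch, not part of the original post: the percentages above follow directly from the two line counts. A small Python version of the same tally, assuming (as with `grep` piped to `wc -l`) that what gets counted is matching lines rather than individual occurrences; the function names are hypothetical:

```python
def count_matching_lines(lines, phrase):
    """Count lines containing the phrase (case-sensitive, like plain grep)."""
    return sum(1 for line in lines if phrase in line)


def shares(counts):
    """Turn raw counts into whole-number percentages of the combined total."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


# The line counts reported in the post above.
reported = {"file system": 909, "filesystem": 207}
print(shares(reported))

# Tiny made-up sample showing the line-based counting.
sample = ["mount a file system", "filesystem type", "no match here"]
print(count_matching_lines(sample, "file system"),
      count_matching_lines(sample, "filesystem"))
```

Two caveats on the method: a line that uses both spellings is counted once in each tally, and capitalised forms such as "File system" are missed by a case-sensitive match, so the split is approximate.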
Martin Schweikert
Apr 27, 1992, 1:16:59 PM
Charli…@mindlink.bc.ca (Charlie Gibbs) writes:
> Over the past few years I’ve noticed that people are running
>words together more and more. I find this silly, irritating, or
>misleading, depending on the state of my liver. Maybe it’s due to
>influence from the German language, where it is standard to string
>words together to produce some truly incredible compound words.
Reminds me of an advertisement for a German/English dictionary where
they said it was «for those $10 German words»!
Martin
—
M. Schweikert-Oberhausen/Germany…@cpp.ob.open.de / My life is based on
<>< Life-Net: martin_s…@credo.zer (Joh3:16) / two things: Belief in
Fax: +49 208 85 97 108, Phone: +49 208 85 97 142 / Christ and Murphy’s Law
Lupe Christoph
Apr 27, 1992, 9:30:21 PM
>lu…@ukw.uucp (Lupe Christoph) writes:
>>Charli…@mindlink.bc.ca (Charlie Gibbs) writes:
>>Donaudampfschiffahrtsgesellschaftskapitaen. This is the standard
>>«longest word ever used» in German. I don’t believe it was really
>>used, but who knows? It’s old, and it’s from Austria…
>>You want to know what it *means*? «Captain at the Danube Steam
>>Boat Corporation».
>>—
>Should that be ‘Captain at the Danube Steam Boat Transportation Corporation’?
> ^^^^^^^^^^^^^^
>Doesn’t -fahrts- mean ‘travel’ or ‘transportation’ or something like that?
You’re right. That’s the trouble with long words: you easily overlook
something.
Chris Flatters,208,7209,homephone
Apr 28, 1992, 12:42:45 AM
Better than allowing vendors to set terminology standards: check what the POSIX
standard uses. IEEE 1003.1 sanctifies the term «file system», with a space.
Chris Flatters
cfla…@nrao.edu
Charlie Gibbs
Apr 28, 1992, 4:31:50 AM
In article <aldavi01….@starbase.spd.louisville.edu>
alda…@starbase.spd.louisville.edu (Arlie Davis) writes:
>> >>>Terminology question: are the Unix units of file storage called
>> >>>«filesystems» (one word) or «file systems» (two words)? Neither is in
>> >>>TNHD.
>
>> My rule of thumb is to write it as one word if it’s being used
>> as a noun, and two words if a verb (phrase). Thus I «log in», while
>> the act of doing so is a «login».
>
>So, to answer Barry’s question, «filesystem», since we file system something,
>right?
Argh. Why do I find myself wanting to call it a «file system»?
Probably because I look upon it as a system of files. For that matter,
why do we not run every adjective into the noun it modifies? Sigh…
Charli…@mindlink.bc.ca
«I’m cursed with hair from HELL!» — Night Court
John Benfield
Apr 28, 1992, 5:15:59 AM
In article <tdfmj…@matt.ksu.ksu.edu>, st…@matt.ksu.ksu.edu (Steve Davis) writes:
>Jay.As…@f160.n3603.z1.FIDONET.ORG (Jay Ashworth) writes:
>
>>Barry Margolin, to All on Friday April 24 1992:
>
>> BM> Terminology question: are the Unix units of file storage called
>> BM> «filesystems» (one word) or «file systems» (two words)? Neither is in
>> BM> TNHD.
>
>>To the best of my knowledge, Barry, from 9 years at this…
>
>>The jargon term for what you boot from, or mount, on a unix system is, from the preponderance of the references I’ve seen, is «filesystem» (no space).
>
>Eh. Excuse me to butt in here with a little bit of >EVIDENCE<.
>All you people have to do is RTFM. Look at this:
>
>%man -k (2) | grep file | grep system
>getdents (2) — gets directory entries in a filesystem independent format
>getdirentries (2) — gets directory entries in a filesystem independent format
>statfs, fstatfs (2) — get file system statistics
>ustat (2) — get file system statistics
>%
>
>Try this with ‘(8)’ for even more confusement. In short, use
>whatever feels better to your fingers while you type it.
Oh…of course!!!! It must be true! They wrote it in the man pages! :}
filesystem: a logical storage allocation for the systematic organization
and containment of data.
file system: a large cabinet intended for the storage of files.
Calling a ‘filesystem’ a ‘file system’ is a Berkeleyism that never
quite died. The terms are only interchangeable if you get paid to
write shoddy man-pages or if your ‘filesystem’ is housed in a ‘file
system’.
Of course…this is only my own personal reality as I know and
propagate it.
______Opinions stated are my own. Transcripts available by request______
===
=—==== AT&T Canada Inc. John Benfield
=—-==== 3650 Victoria Park Ave. Network Support Analyst (MIS)
=—-==== Suite 800
==—===== Willowdale, Ontario attmail : ~jbenfield
======= M2H-3P7 email : uunet!attcan!john
=== (416) 756-5221 Compu$erve: 72137,722
____Eagles may soar, but weasels don’t get sucked into jet engines._____
Anthony J Stieber
Apr 28, 1992, 9:17:24 PM
But my «UNIX Time-Sharing System: UNIX Programmer’s Manual 7th Edition»,
Copyright 1983, 1979, Bell Telephone Laboratories, Incorporated
uses both «filesystem» and «file system». The former tends to be
used when referring to command arguments, while the latter may or
may not be used elsewhere.
e.g. the dump man page has under FILES «filesystem and tape vary with
installation», while the df man page has «Default file systems vary
with installation.» The dcheck man page uses «file system» everywhere
except for the SYNOPSIS.
It seems that at least at this point the manuals were already
inconsistent. How much influence was there outside of Bell
at this point?
For those not familiar with V7 Unix here are some of the new features
mentioned in the introduction:
file lengths now 32 bits rather than 24 bits.
lint
sed
stdio
make
—
<-:(= Anthony Stieber ant…@csd4.csd.uwm.edu uwm!uwmcsd4!anthony
Sam Wilson
Apr 29, 1992, 5:13:36 PM
cfla…@nrao.edu (Chris Flatters,208,7209,homephone) writes:
> In article 73…@ukw.uucp, lu…@ukw.uucp (Lupe Christoph) writes:
> > …
> >It just did a bit of research (this is SunOS 4.1.2):
> > :
> > :
> >This is 81% for «file system», 19% for «filesystem».
> >
> >What does *your* vendor say?
>
> Better than allowing vendors to set terminology standards: check what POSIX
> standard uses. IEEE 1003.1 sanctifies the term «file system», with a space.
But isn’t the function of a standards body to make its standards just
that tiny, little bit different from what went before? Examples:
Ethernet and IEEE 802.3, Unix and POSIX etc etc.
[For the serious — the rationale is that then the original developers
don’t have a commercial lead when the standard is published.]
Sam
Christopher Davis
Apr 30, 1992, 1:59:16 AM
Chris> == Chris Flatters,208,7209,homephone <cfla…@nrao.edu>
Chris> Better than allowing vendors to set terminology standards: check
Chris> what POSIX standard uses. IEEE 1003.1 sanctifies the term «file
Chris> system», with a space.
To some, this would be a good reason to go with «filesystem» instead.
(Hey, that’s what we need, a man(1) program that replaces ‘file system’
with ‘filesystem’, in accordance with POSIX standards, but only when
you’ve set the environment variable POSIX_ME_HARDER.)
—
Christopher Davis * c…@eff.org * System Administrator, EFF * +1 617 864 0665
Samizdata isn’t that different from Samizdat. — Dan’l Danehy-Oakes
Ignatios Souvatzis
May 28, 1992, 4:53:44 PM
In article <1992Apr25….@ukw.uucp> lu…@ukw.uucp (Lupe
Christoph) writes:
Donaudampfschiffahrtsgesellschaftskapitaen. This is the standard
«longest word ever used» in German. I don’t believe it was really
used, but who knows? It’s old, and it’s from Austria…
You want to know what it *means*? «Captain at the Danube Steam
Boat Corporation».
… Steam Boat Travel/Transportation Corporation. «Schiffahrt» is used for
all sorts of things you can do with ships and get paid for.
Btw, you can make the word longer without problem.
Try «Donaudampfschiffahrtsgesellschaftskapitaensmuetzenkordelknoten»,
which is the knot in the cord at his hat.
—
Paper mail: Ignatios Souvatzis, Radioastronomisches Institut der
Universitaet Bonn, Auf dem Huegel 71, D-5300 Bonn 1, FRG
Internet: so…@babsy.mpifr-bonn.mpg.de
Frank Stuart
May 30, 1992, 10:46:28 PM
In article <SOUVA.92M…@aibn55.mpifr-bonn.mpg.de> isouv…@babsy.mpifr-bonn.mpg.de writes:
>In article <1992Apr25….@ukw.uucp> lu…@ukw.uucp (Lupe
>Christoph) writes:
>
> Donaudampfschiffahrtsgesellschaftskapitaen. This is the standard
> «longest word ever used» in German. I don’t believe it was really
> used, but who knows? It’s old, and it’s from Austria…
>
Hmmmm. Donaudampfschiffahrtsgesellschaftskapitaen.
antidisestablishmentarianism (longest English word)
No wonder German gives me such trouble. :>
Of course, it is quite possible that I misspelled antidisestablishmentarianism.
What does this have to do with filesystems? I don’t know. I don’t even know
how to pronounce Kibo.
Frank Stuart | Slower traffic keep right. | Don’t Panic
fst…@eng.auburn.edu | MMMMMmmmmm lutefisk. | Never moon a werewolf
Tim Rolfe
May 31, 1992, 12:57:37 AM
In <fstuart.92…@lab16.eng.auburn.edu> fst…@eng.auburn.edu (Frank Stuart) writes:
[…]
>Hmmmm. Donaudampfschiffahrtsgesellschaftskapitaen.
> antidisestablishmentarianism (longest English word)
pneumonoultramicroscopicsilicovolcanoconiosis
From the Merriam-Webster’s Third International . . .
«A pneumoconiosis caused by the inhalation of very fine silicate or
quartz dust and occurring esp. in miners.»
(Amazing, the useless pieces of information lying around the brain!)
—
— Tim Rolfe
ro…@dsuvax.dsu.edu
RO…@SDNET.BITNET
Matthew Farwell
May 31, 1992, 1:02:36 AM
In article <fstuart.92…@lab16.eng.auburn.edu> fst…@eng.auburn.edu (Frank Stuart) writes:
>In article <SOUVA.92M…@aibn55.mpifr-bonn.mpg.de> isouv…@babsy.mpifr-bonn.mpg.de writes:
>>In article <1992Apr25….@ukw.uucp> lu…@ukw.uucp (Lupe
>>Christoph) writes:
>> Donaudampfschiffahrtsgesellschaftskapitaen. This is the standard
>> «longest word ever used» in German. I don’t believe it was really
>> used, but who knows? It’s old, and it’s from Austria…
>>
>Hmmmm. Donaudampfschiffahrtsgesellschaftskapitaen.
> antidisestablishmentarianism (longest English word)
>
>No wonder German gives me such trouble. :>
>Of course, it is quite possible that I misspelled antidisestablishmentarianism.
You could have missed out the hyphen as well.
anti-disestablishmentarianism.
Dylan.
—
It is no coincidence that in no known language does the phrase ‘As
pretty as an Airport’ appear — Douglas Adams
bruce watson
May 31, 1992, 4:13:02 AM
>No wonder German gives me such trouble. :>
>Of course, it is quite possible that I misspelled antidisestablishmentarianism.
>
It’s correct.
>
>Frank Stuart | Slower traffic keep right. | Don’t Panic
>fst…@eng.auburn.edu | MMMMMmmmmm lutefisk. | Never moon a werewolf
—
___________________________________________________________________________
|wa…@isis.cs.du.edu | «I haven’t been this happy since Toonces the cat |
| Bruce Watson | ran over Skippy the dog.» |
John Hawkinson
May 31, 1992, 7:19:50 AM
>>Hmmmm. Donaudampfschiffahrtsgesellschaftskapitaen.
>> antidisestablishmentarianism (longest English word)
>You could have missed out the hyphen as well.
> anti-disestablishmentarianism.
Isn’t it:
pneumonoultramicroscopicsilicovolcanoconioses
and pneumonoultramicroscopicsilicovolcanoconiosis
?? (from the wp5.1 dictionary)
—
John Hawkinson
jh…@panix.com
Lupe Christoph
May 31, 1992, 7:06:53 PM
gm…@cunixa.cc.columbia.edu (Gabe M Wiener) writes:
>In article <1992May31.0…@panix.com> jh…@panix.com (John Hawkinson) writes:
>>
>>Isn’t it:
>> pneumonoultramicroscopicsilicovolcanoconioses
>>and pneumonoultramicroscopicsilicovolcanoconiosis
>Yup. Miner’s black-lung disease. Currently the longest accepted word
>in the language, though I’m sure any chemist could come up with names of
>chains that go on even longer.
Actually, that’s cheating. pneumonoultramicroscopicsilicovolcanoconioses
is a Greek word. Greek allows — like German — to concatenate words
to make new meanings.
What about plain old English???
The longest English word in /usr/dict/words on SunOS 4.1.2 are
counterproductive and indistinguishable (both 17 letters).
There are 15 Greek and Latin words of the same length or longer.
The longest is electroencephalography (22 letters).
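Editorial sketch, not part of the original post: the search described above amounts to finding the maximum word length in a list and collecting every word that reaches it. A minimal Python version, assuming a plain one-word-per-line list; the sample words come from this thread, and `longest_words` is a hypothetical helper name:

```python
def longest_words(words):
    """Return (max_length, sorted words of that length); (0, []) if empty."""
    cleaned = [w.strip() for w in words if w.strip()]
    if not cleaned:
        return 0, []
    longest = max(len(w) for w in cleaned)
    return longest, sorted(w for w in cleaned if len(w) == longest)


# Words mentioned in the thread, standing in for a real word list.
# On a system that has one, the list could be read from a file such as
# /usr/dict/words (the path named in the post; it may not exist elsewhere).
SAMPLE = [
    "filesystem",
    "counterproductive",
    "indistinguishable",
    "electroencephalography",
]

print(longest_words(SAMPLE))
print(longest_words([w for w in SAMPLE if w != "electroencephalography"]))
```

Deciding which entries count as "plain old English" is a separate judgment that a length check cannot make; the sketch only reproduces the letter counts.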
Richard Marshall
Jun 1, 1992, 12:29:22 PM
In article <1992May31….@ukw.uucp> lu…@ukw.uucp (Lupe Christoph) writes:
>gm…@cunixa.cc.columbia.edu (Gabe M Wiener) writes:
>
>>In article <1992May31.0…@panix.com> jh…@panix.com (John Hawkinson) writes:
>>>
>>>Isn’t it:
>>> pneumonoultramicroscopicsilicovolcanoconioses
>>>and pneumonoultramicroscopicsilicovolcanoconiosis
>
>>Yup. Miner’s black-lung disease. Currently the longest accepted word
>>in the language, though I’m sure any chemist could come up with names of
>>chains that go on even longer.
>
>Actually, that’s cheating. pneumonoultramicroscopicsilicovolcanoconioses
>is a Greek word. Greek allows — like German — to concatenate words
>to make new meanings.
>
>What about plain old English???
>
>The longest English word in /usr/dict/words on SunOS 4.1.2 are
>counterproductive and indistinguishable (both 17 letters).
>There are 15 Greek and Latin words of the same length or longer.
>The longest is electroencephalography (22 letters).
What’s wrong with antidisestablishmentarianism? (28 letters)
—
lah…@cck.cov.ac.uk R.J.Marshall Alias = Rambo Crewe Alexandra F.C.
«Parents of young organic life forms are warned that }_ HHGTTG 04/05/92
towels can be harmful if swallowed in large quantities.» } BBC Radio 4
Marius Milner
Jun 1, 1992, 1:46:00 PM
The longest word in the Oxford English Dictionary is
floccinaucinihilipilification
which is one letter longer than the more famous
antidisestablishmentarianism
floccinaucinihilipilification (note that the ‘cc’ is
pronounced like the letter ‘X’) means ‘the act of
estimating something as worthless’.
Marius
Matthew Farwell
Jun 1, 1992, 2:54:47 PM
anti-disestablishmentarianism is hyphenated.
Tony Lezard
Jun 1, 1992, 4:09:42 PM
> >Isn’t it:
> > pneumonoultramicroscopicsilicovolcanoconioses
> >and pneumonoultramicroscopicsilicovolcanoconiosis
>
> Yup. Miner’s black-lung disease. Currently the longest accepted word
> in the language, though I’m sure any chemist could come up with names of
> chains that go on even longer.
Indeed they could.
Allow me to introduce tryptophan synthetase A protein, an enzyme with
267 amino acids.
It is spelt thus <f/x clears throat>:
Methionylglutaminylarginyltyrosylglutamylserylleucylphenylalanylalany
glutaminylleucyllysylglutamylarginyllysylglutamylglycylalanylphenylal
anylvalylprolylphenylalanylyalylthreonylleucylgylcylasparttlprolylgly
cylisoleucylglutamylglutaminylserylleucyllysylisoleucylaspartylthreony
lleucylisoleucylglutamylalanylglycylalanylaspartylalanylleucylglutamy
lleucylglycylisoleucylprolylphenylalanylserylaspartylprolylleucylalan
ylaspartylglycylprolylthreonylisoleucylglutaminylasparaginylalanylthr
eonylleucylarginylalanylphenylalanylalanylalanylglycylvalylthreonylpr
olylalanylglutaminylcysteinylphenylalanylglutamylmethionylleucyalanyl
leucylisoleucylarginylglutaminyllysylhistidylprolylthreonylisoleucylp
rolylisoleucylglyclleucylleucylmethionyltyrosylalanylasparaginylleucy
lvalylphenylalanylasparaginyllysylglycylisoleucylaspartylglutamylphen
ylalanyltyrosylalanylglutaminylcyteinylglutamyllysylvalylglycylvalylas
partylserylvalylleucyylvalylalanylaspartylvalylprolylvalyglutaminylglu
tamylserylalanylprolylphenylalanylarginylglutaminylalanylalanylleucylar
ginylhistidylasparaginylvalylalanylprolylisoleucylphenylalanylisoleucy
lcysteinylprolylprolylaspartylalanylaspartylaspartylaspartylleucylleuc
ylarginylglutaminylisoleucylalanylseryltyrosylglycylarginylglycyltyros
ylthreonyltyrosylleucylleucylserylarginylalanylglycylvalylthreonylglyc
ylalanylglutamylasparaginylarginylalanylalanylleucylprolylleucylaspara
ginylhistidylleucylvalylalanyllysylleucyllysylglutamyltyrosylasparagin
ylalanylalanylprolylprolylleucylglutaminylglycylphenylalanylglycylisol
eucylserylalanylprolylaspartylglutaminylvalyllysylalanylalanylisoleucy
laspartylalanylglycylalanylalanylglycylalanylisoleucyserylglycylserylal
anylisoleucylvalyllysylisoleucylisoleucylglutamylglutaminylhistidylaspa
raginylisoleucyglutamylprolylglutamyllysylmethionylleucylalanylalanylle
ucyllysylvalylphenylalanyivalyglutaminylprolylmethionyllysylalanylalany
lthreonylarginylserine.
+++ Tony Lezard: to…@mantis.co.uk or failing that, ar…@phx.cam.ac.uk +++
** Seeking accommodation (just nights) in/near Phoenix, Arizona for the **
** week starting September 6th. Luxury not a requirement and will pay! **
** Please email me if you know of someone who might be able to help. **
Thomas M Farrell
Jun 1, 1992, 7:46:22 PM
In article <SOUVA.92M…@aibn55.mpifr-bonn.mpg.de> isouv…@babsy.mpifr-bonn.mpg.de writes:
>In article <1992Apr25….@ukw.uucp> lu…@ukw.uucp (Lupe
>Christoph) writes:
>
> Donaudampfschiffahrtsgesellschaftskapitaen. This is the standard
> «longest word ever used» in German. I don’t believe it was really
> used, but who knows? It’s old, and it’s from Austria…
>
Hmmmm. Donaudampfschiffahrtsgesellschaftskapitaen.
antidisestablishmentarianism (longest English word)
German:
Konstantinopolitanischerdudelsackpfeiffermachergeselle
English:
Pneumonoultramicroscopicsilicavolcanoconiosis
—
Tom Farrell __ gan…@dworkin.ccs.northeastern.edu (all of the time) ______
(c) 1992 __ / tfar…@lynx.northeastern.edu (presently down) /
/ tfar…@isis.cs.du.edu (forwards to first) Pink—> /
Are you cute, single, male, gay, and in the Boston area? Email me! /
Dan Tilque
Jun 2, 1992, 12:44:53 AM
lu…@ukw.uucp (Lupe Christoph) writes:
>
>The longest English word in /usr/dict/words on SunOS 4.1.2 are
>counterproductive and indistinguishable (both 17 letters).
>There are 15 Greek and Latin words of the same length or longer.
>The longest is electroencephalography (22 letters).
You seem to be under the illusion that words made of elements from
another language belong to that language. Perhaps in German, that’s
the way things work, but not in English. English has borrowed all
kinds of words, roots, suffixes, and prefixes and they all become
English. Words made from these morphemes, e.g. electroencephalography,
are also English.
This is not to say that /usr/dict/words is all English words. There
are several foreign origin phrases which are commonly used in English
where the individual elements of those phrases are not English words.
However, some of those individual words are in /usr/dict/words. The
reason for this is so that ispell will not complain about the use of
these phrases. For example, «situ», «hoi», and «polloi» are in there
so that ispell will not flag «in situ» and «hoi polloi».
—
Dan Tilque — da…@logos.WR.TEK.COM
Lupe Christoph
Jun 1, 1992, 3:31:02 PM
lah…@cck.coventry.ac.uk (Richard Marshall) writes:
>>What about plain old English???
>>
>>The longest English word in /usr/dict/words on SunOS 4.1.2 are
>>counterproductive and indistinguishable (both 17 letters).
>>There are 15 Greek and Latin words of the same length or longer.
>>The longest is electroencephalography (22 letters).
>What’s wrong with antidisestablishmentarianism? (28 letters)
Disqualified. Anti is Greek. Dis is a Latin prefix.
I’ll accept establishmentarianism for 21 letters. It’s
in my 1977 Webster’s New Collegiate.
David Casseres
Jun 2, 1992, 2:46:34 AM
In article <1992May31….@ukw.uucp>, lu…@ukw.uucp (Lupe Christoph)
writes:
> Actually, that’s cheating. pneumonoultramicroscopicsilicovolcanoconioses
> is a Greek word.
Betcha it isn’t!
> What about plain old English???
>
> The longest English word in /usr/dict/words on SunOS 4.1.2 are
> counterproductive and indistinguishable (both 17 letters).
> There are 15 Greek and Latin words of the same length or longer.
> The longest is electroencephalography (22 letters).
Sorry. I don’t really know about «pneumonoultramicroscopic-
silicovolcanoconioses,» but «electroencephalography» is ENGLISH. Technical
English, to be sure, but are you going to claim that any technical word made up
of Greek and/or Latin roots is Greek or Latin?
—
David Casseres
Exclaimer: Wow!
Emory F. Bunn
Jun 2, 1992, 4:52:15 AM
In article <1992Jun1.1…@ukw.uucp> lu…@ukw.uucp (Lupe Christoph) writes:
:lah…@cck.coventry.ac.uk (Richard Marshall) writes:
:
(Stuff deleted)
:
:>What’s wrong with antidisestablishmentarianism? (28 letters)
:
:Disqualified Anti is Greek. Dis is a Latin prefix.
:I’ll accept establishmentarianism for 21 letters. It’s
:in my 1977 Webster’s New Collegiate.
:—
By this logic, you must refuse to accept «television» as a word.
(«tele-» is Greek; «vision» is from the Latin.)
-Ted
Arlie Davis
Jun 2, 1992, 10:49:46 AM
In <10egnv…@agate.berkeley.edu> ted@physics3 (Emory F. Bunn) writes:
[…]
> :Disqualified Anti is Greek. Dis is a Latin prefix.
> :I’ll accept establishmentarianism for 21 letters. It’s
> :in my 1977 Webster’s New Collegiate.
> By this logic, you must refuse to accept «television» as a word.
> («tele-» is Greek; «vision» is from the Latin.)
And «automobile».
«Such a thing [the car], if it were to exist, would clearly be called either
an ‘autokinesin’, or an ‘isomobile’.» — Goethe, on cars.
cgp…@minster.york.ac.uk
Jun 1, 1992, 11:39:29 PM
> antidisestablishmentarianism (longest English word)
^
Actually that isn’t one word, it should have a hyphen after the anti. Sorry.
Chris P-E. cgp…@minster.york.ac.uk
Ian Gent
Jun 2, 1992, 6:00:57 PM
No it isn’t! The very long p-word (which somebody cited as
> pneumonoultramicroscopicsilicovolcanoconiosis
but I don’t vouch for it)
is indeed in the Oxford English Dictionary — in the supplement to the 1st
edition and presumably in the 2nd.
Although the OED gives its meaning as a disease, it also notes that it
is almost always only ever used as an example of a very long word — and
all their quotations just cite it as a long word.
Ian
Alfvaen
Jun 2, 1992, 11:54:57 PM
Ian Gent writes
That sounds like the word «floccinaucinihilipilification»…in my Random
House dictionary, it’s listed as «rare» and notes that it’s used mainly as
an example of a very long word. I think it means «The act of estimating as
worthless», which makes the word somewhat self-referential…;-}
—
—Alfvaen(a.k.a. Aaron V. Humphrey)
Canadian Network For Space Research, Edmonton, Alberta, Canada
Her hair spilled out like rootbeer…
Current Album—Comedy Classics
Evan Kirshenbaum
Jun 2, 1992, 9:20:21 PM
In article <1992May31….@ukw.uucp> lu…@ukw.uucp (Lupe Christoph) writes:
>The longest English word in /usr/dict/words on SunOS 4.1.2 are
>counterproductive and indistinguishable (both 17 letters).
Now that we’ve heard your judgement of them, what are the words?
(And I have a hard time seeing how two 17 letter words can be
indistinguishable.)
>There are 15 Greek and Latin words of the same length or longer.
>The longest is electroencephalography (22 letters).
You mean «electroencephalography» was borrowed from Greek? or was it
Latin? (What did they call it before the Norman conquest?) Or do you
mean that the word was created as an English word by combining
(originally foreign) morphemes using still-productive rules?
In any case, «antidisestablishmentarianism» still seems like the
longest (and it *is* in /usr/dict/words on HP-UX 8.0). Even if you
insist that it should be «anti-» (and I don’t believe it is/was
spelled that way consistently), «disestablishmentarianism» has 24
letters. Neither one seems to get much use these days.
«Electroencephalography» is probably the longest you’re likely to hear
(outside of a «what’s the longest word in English» puzzle).
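The word-list experiment is easy to repeat. What follows is a minimal Python sketch, assuming a list with one word per line; the function names are made up for illustration, and it makes no attempt to tell "English" words from Greek or Latin ones.

```python
def longest_words(words):
    """Return every word that ties for the greatest length, in input order."""
    cleaned = [w.strip() for w in words if w.strip()]
    if not cleaned:
        return []
    longest = max(len(w) for w in cleaned)
    return [w for w in cleaned if len(w) == longest]


def longest_in_file(path):
    """Read a one-word-per-line list (path is an assumption; it varies by system)."""
    with open(path, errors="replace") as fh:
        return longest_words(fh)


# Small inline sample instead of a real word list.
sample = ["handbook", "counterproductive", "woodwind", "indistinguishable"]
for word in longest_words(sample):
    print(len(word), word)
```

On a system that still has the file, `longest_in_file("/usr/dict/words")` would do the same over the real list; many later systems keep it at /usr/share/dict/words instead.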
Evan Kirshenbaum
HP Laboratories
3500 Deer Creek Road, Building 26U
Palo Alto, CA 94304
kirsh…@hplabs.hp.com
(415)857-7572
Michael Qvortrup
Jun 2, 1992, 9:05:45 PM
No, most certainly not. You have concatenated an adjective and a noun. This
is, in this case, clearly not correct (there might be cases where you can
do it, but I can’t think of any right now, and I doubt they exist). German
allows a somewhat ridiculous concatenation of nouns (see the Danube captain
above).
The correct phrase would be
Konstantinopolitanischer Dudelsackpfeiffenmachergeselle
^ it is plural!
which is most certainly two words. The adjective might take the prize for
the longest adjective, though.
These German noun concatenations would most probably not be accepted as
‘proper, separate words’ in the sense that they would make it into a
dictionary.
Greetings,
—Mike
—
#include <std-disclm.h>—«… and there is a small flaw in my character.»—
Real Life: Michael Christian Heide Qvortrup A Dane ETH, Zuerich
e-mail : qvor…@inf.ethz.ch abroad Switzerland
Institut fuer wissenschaftliches Rechnen / Inst. of Scientific Computation
Gavin Williams
Jun 3, 1992, 2:05:23 AM
I can’t remember the exact word, but it begins with the letters ‘floppi’
and is described as ‘ The art of estimating something as being useless’.
—
It matters not how strait the gate,
Gavin… How charged with punishments the scroll,
will…@unix1.tcd.ie I am the master of my fate:
I am the captain of my soul. — W.E. Henley
Dave Brown
Jun 3, 1992, 5:07:04 AM
In article <1992Jun2.1…@aisb.ed.ac.uk> i…@aifh.ed.ac.uk (Ian Gent) writes:
>>In article <1992Jun1.0…@rdg.dec.com> mar…@ed.ac.uk writes:
>>>
>>>The longest word in the Oxford English Dictionary is
>>> floccinaucinihilipilification
>
>No it isn’t! The very long p-word (which somebody cited as
>
>> pneumonoultramicroscopicsilicovolcanoconiosis
>
>but I don’t vouch for it)
>
>is indeed in the Oxford English Dictionary — in the supplement to the 1st
>edition and presumably in the 2nd.
>
Okay, floccinaucinihilipilification (I mentioned somebody with a Random
House dictionary claiming that the spelling was floccinoccinihilipilification,
but (amazingly enough) that’s an Americanized spelling) may not be the
longest word to be found in a dictionary, but as far as I know,
floccinaucinihilipilificate is the longest root word in English….
Dave Brown
dagb…@descartes.waterloo.edu
Bill Squire
Jun 3, 1992, 12:42:06 PM
lu…@ukw.uucp (Lupe Christoph) writes:
> gm…@cunixa.cc.columbia.edu (Gabe M Wiener) writes:
> Actually, that’s cheating. pneumonoultramicroscopicsilicovolcanoconioses
> is a Greek word. Greek allows — like German — to concatenate words
> to make new meanings.
>
> What about plain old English???
>
> The longest English word in /usr/dict/words on SunOS 4.1.2 are
> counterproductive and indistinguishable (both 17 letters).
> There are 15 Greek and Latin words of the same length or longer.
> The longest is electroencephalography (22 letters).
I have come to believe the longest «real English» word is
incomprehensibilities at 21 letters. Electroencephalography is really a
technical term and a concatenated word as well.
Bill Squire (bi…@hacktic.nl)
——————————————————-
Lupe Christoph
Jun 2, 1992, 11:16:05 PM
alda…@draconis.spd.louisville.edu (Arlie Davis) writes:
>In <10egnv…@agate.berkeley.edu> ted@physics3 (Emory F. Bunn) writes:
>[…]
>> :Disqualified Anti is Greek. Dis is a Latin prefix.
>> :I’ll accept establishmentarianism for 21 letters. It’s
>> :in my 1977 Webster’s New Collegiate.
>> By this logic, you must refuse to accept «television» as a word.
>> («tele-» is Greek; «vision» is from the Latin.)
>And «automobile».
Oh, I don’t reject those as words. They’re perfectly good in everyday
life. It’s just that these words rely on forming rules alien
to English. In English, you don’t use a wordsruntoghethermethod.
>«Such a thing [the car], if it were to exist, would clearly be called either
>an ‘autokinesin’, or an ‘isomobile’.» — Goethe, on cars.
I like Ipsokineton better. (Iso is Greek, from isos (equal);
ipse is Latin for «the same». Kinetos means «moved», kinesin
is the accusative of kinesis, movement. I would readily agree
that kinetos is not equivalent to mobilis. ;-))
Gavin Williams
Jun 3, 1992, 12:42:40 PM
In <VsgRLB…@hacktic.nl> bi…@hacktic.nl (Bill Squire) writes:
>lu…@ukw.uucp (Lupe Christoph) writes:
>I have come to believe the longest «real English» word is
>incomprehensibilities at 21 letters. Electroencephalography is really a
>technical term and a concatenated word as well.
Actually, antidisestablishmentarianism is longer, but, as I said in
another post, floppi……….something is the longest.
He that would sup with The Devil
Jun 3, 1992, 6:04:24 AM
In article <1992Jun2.1…@ukw.uucp> lu…@ukw.uucp (Lupe Christoph) writes:
> Oh, I don’t reject those [antidisestablishmentarianism] as words.
> They’re perfectly good in everyday life It’s just that these
> words rely on forming rules alien to English. In English, you don’t
> use a wordsruntoghethermethod.
Perhaps I’m mistaken, but you’re not a native speaker of English,
are you?
`Antidisestablishmentarianism’ has only one stem word: `Establish’.
Although it’s uncommon in English to concatenate more than two stem
words, and we don’t have any constructions like German’s
`Schutzengrabenvernichtungspanzerkraftwagen’, it’s common to append more
than two affixes, such as `anti-‘, `dis-‘, `-ment’, `-arian’, or `-ism’.
I am sure few people would object to `antipostmodernism’, for example.
—
And for to see, and eke for to be seye
Mark-Jason Dominus m…@central.cis.upenn.edu
Richard Marshall
Jun 1, 1992, 7:52:12 PM
There’s one problem with this. How do you fit it all on a SCRABBLE board? B-)
David Casseres
Jun 3, 1992, 10:12:27 PM
In article <1992Jun2.1…@ukw.uucp>, lu…@ukw.uucp (Lupe Christoph)
writes:
> Oh, I don’t reject [television and automobile] as words. They’re perfectly
> good in everyday life It’s just that these words rely on forming rules
> alien to English. In English, you don’t use a wordsruntoghethermethod.
Who don’t? Have you ever used a handbook? Played a woodwind instrument? Been
caught in a downpour or a cloudburst? Played football? Eaten mincemeat?
Tom Watson
Jun 4, 1992, 1:30:37 AM
Super-cali-fragil-istic-expli-elli-docious
Please no ‘spelling’ flames, I never saw the movie.
—-
Tom Watson
johana!t…@apple.com
Hannu Helminen ti
Jun 4, 1992, 1:55:02 AM
What about Finnish:
Ep{j{rjestelm{llistytt{m{tt|myydell{ns{k{{n
({ is letter a with two dots over it)
«Ep{» is actually a prefix, but you can use the word without it.
Please don’t ask me to explain what it means
—
_| _ _ Hannu d…@stekt.oulu.fi // Does anybody else in here
(_|| | ) Helminen d…@phoenix.oulu.fi X/ feel the way I do?
Charlie Gibbs
Jun 4, 1992, 4:02:48 AM
In article <26…@goofy.Apple.COM> johana!t…@apple.com (Tom Watson)
writes:
>Super-cali-fragil-istic-expli-elli-docious
>
>Please no ‘spelling’ flames, I never saw the movie.
Neither did I. But «The Family Circus», normally a terminally
cute comic strip, did come out with what might be considered a
mnemonic, in the form of a series of little drawings of:
A cow wearing a cape, flying through the air
A list of delicate items of china
A piece of a small tree branch
A couple of eggs
A single pea
A back lane
A bag of money
Someone motioning to be quiet
which was interpreted as:
Super-cow, a fragile list, stick, eggs, pea, alley, dough, shush!
BTW if «antidisestablishmentarianism» has a hyphen, it must
have been an addition. The foot-thick «Webster’s Twentieth-Century
Dictionary» that’s been in the family as long as I remember (and
probably longer, it’s dated 1937) contains an addendum that not
only gives a definition, but also spells it without the hyphen.
As for more commonly used long words, I’m surprised that
nobody has yet mentioned those darlings of the industry,
«interoperability» and «internationalization», which are
commonly abbreviated to «i14y» and «i18n» respectively.
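For anyone who wants to check the counts: both abbreviations keep the first and last letters and replace everything in between with the number of letters dropped. A minimal Python sketch follows; the function name `numeronym` is only an illustrative choice.

```python
def numeronym(word):
    """Abbreviate as first letter + count of omitted letters + last letter."""
    if len(word) < 4:
        return word  # too short for the abbreviation to save anything
    return f"{word[0]}{len(word) - 2}{word[-1]}"


print(numeronym("internationalization"))  # i18n
print(numeronym("interoperability"))      # i14y
```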
Charli…@mindlink.bc.ca
«I’m cursed with hair from HELL!» — Night Court
Kristian Koehntopp
Jun 4, 1992, 2:01:09 AM
A
«Dudelsackpfeiffenmachergesellenpruefungskommisonsvorsitzende»
is the chairlady of the committee for the examination of a bagpipe
maker’s apprentice. The example can easily be extended as in
«Dudelsackpfeiffenmachergesellenpruefungskommisonsvorsitzendentochter»
which is her daughter. You do not find this in any dictionary
but it follows the common rules for German word concatenation
and is — well — understood.
Kristian
—
Kristian Koehntopp, Harmsstrasse 98, FRG W-2300 Kiel, +49 431 676689
SN182A-102 User too stupid error.
Colin Dente
Jun 3, 1992, 5:54:08 PM
|> Actually, antidisestablishmentarianism is longer, but, as I said in
|> another post, floppi……….something is the longest.
I thought that floccinaucinihillipillification (that’s spelt straight off the
top of my head, so it may well be wrong) was made up at Eton as a Latin-related
joke. Unfortunately, my OED is at home, and the concise doesn’t have it.
Colin
—
Colin Dente | JANET: de…@uk.ac.manchester
Manchester Computing Centre | ARPA: de…@manchester.ac.uk
University of Manchester, UK | UUCP: …!mcsun!ukc!manchester!dente
… Blatantly Bisexual … | B3(4) f+ c g k r s
dks
Jun 4, 1992, 6:32:41 AM
Super-cali-fragilistic-expi-ali-docious.
(Even though the sound of it is something quite atrocious)
>Please no ‘spelling’ flames, I never saw the movie.
Not a flame but a material correction,
given that we are discussing word-length.
Cheers!
— Dhanesh
tha…@desire.wright.edu
Jun 3, 1992, 7:41:30 PM
In article <MJD.92Ju…@saul.cis.upenn.edu>,
m…@saul.cis.upenn.edu (He that would sup with The Devil) writes:
>
> Perhaps I’m mistaken, but you’re not a native speaker of English,
> are you?
>
> `Antidisestablishmenatrianism’ has only one stem word: `Establish’.
> Although it’s uncommon in English to concatentate more than two stem
> words, and we don’t have any constructions like German’s
> `Schutzengrabenvernichtungspanzerkraftwagen’ …
I’m a native English speaker (well, *American* English …) and I speak no
German, but I am intrigued: what the *HECK* is «schutzen… (etc., et al.)»?
And regardless of whatever it means, why wouldn’t the typical German
come up with a shorter word? Maybe such as the (ahem) all-American
«thingamabob»?
——ted hayes
«thingamabob’s your uncle!»
Mejia Pablo
Jun 4, 1992, 10:26:57 AM
Super-cali-fragi-listic-ixpi-alle-docious
Please no ‘spelling’ flames, I never saw the movie.
Sorry, couldn’t resist.
—
*** Theory is when we know everything but nothing goes right. ***
*** Practice is when everything goes right but nobody knows why. ***
*** Here, we have an harmonious mix between theory and practice: ***
*** nothing goes right and nobody knows why. ***
Peter Moylan
Jun 4, 1992, 10:41:10 AM
In article <1992Jun2.1…@ukw.uucp>, lu…@ukw.uucp (Lupe Christoph) writes:
>>> By this logic, you must refuse to accept «television» as a word.
>>> («tele-» is Greek; «vision» is from the Latin.)
>
>>And «automobile».
>
> Oh, I don’t reject those as words. They’re perfectly good in everyday
> life It’s just that these words rely on forming rules alien
> to English. In English, you don’t use a wordsruntoghethermethod.
I all most gain said that, but in hind sight I saw that no body would
under stand the out land ish words I would un think ingly have run
to gether.
—
Peter Moylan ee…@wombat.newcastle.edu.au
Charles Lasner
Jun 4, 1992, 7:07:18 PM
Don’t know about the longest word, but how ’bout the longest pop song title:
Jan and Dean’s
Anaheim, Azusa, Cookamonga Sewing Circle, Book Review and Timing Association.
Do such organizations with multi-purpose names actually exist?
I have heard of the Brooklyn Heights Poker and Literary Guild BTW. They were
once peripherally involved with certain computer people, etc., a long story
to be repeated in another post.
cjl (my initials don’t form a long word either)
Hans Mulder
Jun 4, 1992, 9:13:11 PM
In <1992Jun2.1…@ukw.uucp> lu…@ukw.uucp (Lupe Christoph) writes:
>Oh, I don’t reject those as words. They’re perfectly good in everyday
^^^^^^^^
>life It’s just that these words rely on forming rules alien
>to English. In English, you don’t use a wordsruntoghethermethod.
Care to explain how the word «everyday» came into being if it wasn’t by
using the wordsruntogethermethod? Or is «everyday» a Greek loan word, too?
—
Hans Mulder ha…@cs.kun.nl
Alfvaen
Jun 4, 1992, 9:39:40 PM
Charles Lasner writes
> Don’t know about the longest word, but how ’bout the longest pop song
title:
>
> Jan and Dean’s
>
> Anaheim, Azusa, Cookamonga Sewing Circle, Book Review and Timing
Association.
AACSCBRATA. Almost a pronounceable acronym.
> Do such organizations with multi-purpose names actually exist?
>
> I have heard of the Brooklyn Heights Poker and Literary Guild BTW. They
were
> once peripherally involved with certain computer people, etc., a long
story
> to be repeated in another post.
Longest band name I’ve heard:
We’ve Got A Fuzzbox And We’re Going To Use It
Longest album title I can think of offhand:
Edie Brickell & the New Bohemians’
Shooting Rubberbands At The Stars
(Terence Trent D’Arby’s Neither Fish Nor Flesh is longer if you include the
subtitle, which I can never remember.)
By the way, does anyone know if there’s truth to the rumour that Paul Simon
and Edie Brickell recently got married? I heard it on talk.bizarre, so I’m
skeptical…
> cjl (my initials don’t form a long word either)
—
—Alfvaen(a.k.a. Aaron V. Humphrey)
Canadian Network For Space Research, Edmonton, Alberta, Canada
Her hair spilled out like rootbeer…
Current Album—Red Rider:Neruda
Andrew Rogers
Jun 4, 1992, 11:22:07 PM
In article <1992Jun4.1…@kakwa.ucs.ualberta.ca> aa…@space.ualberta.ca (Alfvaen) writes:
>Charles Lasner writes
>> Don’t know about the longest word, but how ’bout the longest pop song
>title:
>>
>> Jan and Dean’s
>>
>> Anaheim, Azusa, Cookamonga Sewing Circle, Book Review and Timing
>Association.
Not even close. How about Hoagy Carmichael’s WWII classic, «I’m A Cranky
Old Yank in a Clanky Old Tank on the Streets of Yokohama with my <mumble
mumble> Mama Doin’ Those Neat-o, Beat-o, Flat-on-my-seat-o Hirohito Blues»?
>Longest band name I’ve heard:
>
>We’ve Got A Fuzzbox And We’re Going To Use It
Nope. Try «The Rock And Roll Dubble Bubble Trading Card Company of
Philadelphia 19141», who actually charted in the late 60’s with (what else?)
«Bubble Gum Music».
>Longest album title I can think of offhand:
>
>Edie Brickell & the New Bohemians’
>
>Shooting Rubberbands At The Stars
There’s a late-60’s Tyrannosaurus Rex LP with a title something like «My
people were fair <mumble mumble> stars in their hair but now they’re
content <mumble mumble> on their brows».
>By the way, does anyone know if there’s truth to the rumour that Paul Simon
>and Edie Brickell recently got married? I heard it on talk.bizarre, so I’m
>skeptical…
Yes; it’s been in all the newspapers. But is there any truth to the rumor
that EB is Jerry Garcia’s daughter?
AWR
CP/M lives!
Jun 5, 1992, 12:35:58 AM
They do if they’re null-terminated:
.ASCIZ /cjl/
is one longword on a VAX.
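The arithmetic: three characters plus the terminating NUL make four bytes, and a VAX longword is 32 bits. A small Python sketch of the same count, where `asciz` is a made-up helper that imitates the directive rather than any real assembler interface:

```python
def asciz(text):
    """Encode text as ASCII bytes followed by a terminating NUL, like .ASCIZ."""
    return text.encode("ascii") + b"\x00"


LONGWORD_BYTES = 4  # a VAX longword is 32 bits

data = asciz("cjl")
print(len(data), len(data) == LONGWORD_BYTES)  # 4 True
# The VAX is little-endian, so the first character lands in the low byte.
print(hex(int.from_bytes(data, "little")))     # 0x6c6a63
```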
BTW, have you considered using other than the first letter of your middle
and last names so that your initials could be three hexadecimal digits,
i.e., twelve bits?
Finally, if I think of PDP-8 data as three hexadecimal digits, or three
nybbles, is that a trybble?
Roger Ivie
iv…@cc.usu.edu
Frank Stuart
Jun 3, 1992, 8:10:36 PM
>>floccinaucinihilipilification (note that the ‘cc’ is
>>pronounced like the letter ‘X’) means ‘the act of
>>estimating something as worthless’.
>
>anti-disestablishmentarianism is hyphenated.
Does the hyphen count as a letter when figuring its length?
(Is that reason enough for the floccinaucinihilipilification of this thread?)
Frank Stuart | Slower traffic keep right. | Don’t Panic
fst…@eng.auburn.edu | MMMMMmmmmm lutefisk. | Never moon a werewolf
Charlie Gibbs
Jun 5, 1992, 4:24:22 AM
In article <1992Jun4.1…@news.columbia.edu>
las…@watsun.cc.columbia.edu (Charles Lasner) writes:
>Don’t know about the longest word, but how ’bout the longest pop song title:
>
>Jan and Dean’s
>
>Anaheim, Azusa, Cookamonga Sewing Circle, Book Review and Timing Association.
Here’s one by Shawn Phillips (hope I remember it right):
She was waiting for her mother at the station in Torino and
you know I love you baby but it’s getting too heavy to laugh
Then there’s Pink Floyd’s
Several species of small furry animals gathered together
in a cave and grooving with a Pict
(Do your own capitalization; I’m tuckered out just typing them in.)
Any more?
Charli…@mindlink.bc.ca
I’m looking for the stationery department but they keep moving it.
Garrett Wollman
Jun 5, 1992, 5:24:32 AM
In article <DM.92Ju…@stekt2.oulu.fi> d…@stekt2.oulu.fi (Hannu Helminen ti) writes:
>What about Finnish:
>Ep{j{rjestelm{llistytt{m{tt|myydell{ns{k{{n
>({ is letter a with two dots over it)
>»Ep{» is actually a prefix, but you can use the word without it.
>Please don’t ask me to explain what it means
Ah, but «h{{y|aie» is so much more impressive (one consonant, seven
vowels) and just as vacuous… And don’t forget that your `k{{n’,
`ns{‘, `ll{‘, `myys’, and `m{t|n’ are all suffixes… Pity I can’t
figure out what it means to j{rjestelm{llistytt{{. (I think I got
that right.)
-GAWollman
ObComputers: ISO 646 is such a pain, but MIME is not well-understood
as yet and my display doesn’t do 8859-1 anyway. And where did IBM get
«Code Page 850», or whatever the character set that my PC displays is
called, from anyway?
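For readers without a Finnish terminal, the braces and bars can be mapped back by hand. A minimal Python sketch, assuming the usual Finnish/Swedish national variant of ISO 646; the thread itself only states that { is a with two dots, so the rest of the table is an assumption, and the names `ISO646_FI` and `decode_finnish` are invented here.

```python
# Assumed mapping for the Finnish/Swedish national variant of ISO 646.
# Only "{" -> "ä" is stated in the thread; the other entries are assumptions.
ISO646_FI = str.maketrans({
    "{": "ä", "|": "ö", "}": "å",
    "[": "Ä", "\\": "Ö", "]": "Å",
})


def decode_finnish(text):
    """Replace the national-variant code points with the letters they stand for."""
    return text.translate(ISO646_FI)


print(decode_finnish("h{{y|aie"))  # hääyöaie
```

The same call works on the longer word quoted above.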
—
Garrett A. Wollman = wol…@uvm.edu = UVM is welcome to my opinions
= uvm-gen!wollman =
That’s what being alive is all about. No deity, no higher goal
exists, than to bring joy to another person. — Elf Sternberg
Chip Olson
Jun 5, 1992, 6:41:56 AM
>Charles Lasner writes
>> Don’t know about the longest word, but how ’bout the longest pop song
>title:
>>
>> Jan and Dean’s
>>
>> Anaheim, Azusa, Cookamonga Sewing Circle, Book Review and Timing
>Association.
>
>AACSCBRATA. Almost a pronounceable acronym.
Camper Van Beethoven’s _The Third Album plus Vampire Can Mating Oven_ CD has
a weird instrumental called _Processional_, which in the liner notes is
mentioned as having originally been entitled, «Why don’t you challenge the
boundaries of rock music by playing harsh furious dissonant guitar noise
music with lyrics exclusively about death and sex and pretend like you are
making some kind of original statement about the relation between the two
and therefore expressing the pain and confusion of modern society, and then
become a rock critic and write about your own band under a different name
but not before you move to New York or LA or Chicago or some sufficiently
urban area and live in a bad part of town while still receiving checks from
your parents who were probably liberals and didn’t let you watch enough
violence on TV and so you never got it out of your system, and then go to
law school like everyone else».
Beat that.
>Longest album title I can think of offhand:
>
>Edie Brickell & the New Bohemians’
>
>Shooting Rubberbands At The Stars
Well, the aforementioned CVB album is up there, although it’s 2 albums on
one CD. They also have albums entitled _Our Beloved Revolutionary Sweet-
heart_ and _Telephone Free Landslide Victory_. Then of course there’s
_Liquid Acrobat As Regards The Air_ by The Incredible String Band.
—
-Chip Olson. | ol…@husc.harvard.edu | c…@gnu.ai.mit.edu (seldom used)
«‘Cause what the world needs now are some true words of wisdom,
like la la la la la la la la la…» -Cracker.
Thomas Farmer
Jun 5, 1992, 3:39:00 AM
> Neither did I. But «The Family Circus», normally a terminally
>cute comic strip, did come out with what might be considered a
>mnemonic, in the form of a series of little drawings of:
>
>Super-cow, a fragile list, stick, eggs, pea, alley, dough, shush!
Cute.
> «antidisestablishmentarianism»
How about antidisestablishmentarianists?
—
Thomas Farmer | tfa…@datamark.co.nz or | Love is a bucketful
Datamark Intl Ltd | tfa…@cavebbs.welly.gen.nz | of still warm beagles.
Technical Writer | +64-4-233-8186 (work) |
& PC Wrangler | +64-4-479-6306 (home) | Share and Enjoy
Peter David THOMPSON
Jun 5, 1992, 7:40:44 AM
Charli…@mindlink.bc.ca (Charlie Gibbs) writes:
>>Super-cali-fragil-istic-expli-elli-docious
>>
>>Please no ‘spelling’ flames, I never saw the movie.
> Neither did I. But «The Family Circus», normally a terminally
>cute comic strip, did come out with what might be considered a
>mnemonic, in the form of a series of little drawings of:
> A cow wearing a cape, flying through the air
> A list of delicate items of china
> A piece of a small tree branch
> A couple of eggs
> A single pea
> A back lane
> A bag of money
> Someone motioning to be quiet
The «My Word» series on BBC radio had a segment where Frank Muir and Dennis
Norden would each be asked to make up a short story that ended with a pun
on a quotation. One week’s effort by one of them ended with:
Soup
A cauli(flower)
fridge
elastic
Spilt pea
halitosis
>which was interpreted as:
>Super-cow, a fragile list, stick, eggs, pea, alley, dough, shush!
> BTW if «antidisestablishmentarianism» has a hyphen, it must
>have been an addition. The foot-thick «Webster’s Twentieth-Century
>Dictionary» that’s been in the family as long as I remember (and
>probably longer, it’s dated 1937) contains an addendum that not
>only gives a definition, but also spells it without the hyphen.
> As for more commonly used long words, I’m surprised that
>nobody has yet mentioned those darlings of the industry,
>»interoperability» and «internationalization», which are
>commonly abberviated to «i14y» and «i18n» respectively.
* These opinions belong to p…@mundil.cs.mu.OZ.AU unless otherwise specified.
«GNU Make will no longer go into an infinite loop when fed the horrid trash
that passes for makefiles that `imake’ produces (so you can compile X, despite
the extreme stubbornness and irrationality of its maintainers).»-version 3.55
Charles Lasner
Jun 5, 1992, 8:06:01 AM
Someone told me once about a non-DEC PDP-8 clone design where the documentation
actually bothered to explain the -8 opcodes as 3 hex digits, not the customary
octal. He claims it even almost made some sense. Treating the indirect bit
as part of the opcode gives you 16 instruction groups instead of 8,
followed by a two-nybble address that locates the operand.
cjl (not «cjl «)
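The hex reading can be sketched in a few lines of Python. This assumes the standard PDP-8 memory-reference layout (3-bit opcode, indirect bit, page bit, 7-bit offset) and does not describe the clone's actual documentation; the function names are made up, and IOT and OPR words (opcodes 6 and 7) use their low bits differently.

```python
def split_hex(word):
    """Split a 12-bit PDP-8 word into a 4-bit group and an 8-bit address field."""
    if not 0 <= word <= 0o7777:
        raise ValueError("PDP-8 words are 12 bits")
    group = word >> 8      # 3-bit opcode plus the indirect bit
    address = word & 0xFF  # page bit plus 7-bit offset
    return group, address


def describe(word):
    """Show the hex form next to the customary fields of a memory-reference word."""
    group, address = split_hex(word)
    return {
        "hex": f"{word:03X}",
        "opcode": group >> 1,
        "indirect": bool(group & 1),
        "current_page": bool(address & 0x80),
        "offset": address & 0x7F,
    }


# Octal 1420 is TAD, indirect, page zero, offset 20 (octal); in hex it reads 310.
print(describe(0o1420))
```

Read this way, the first hex digit picks one of 16 groups and the remaining two digits carry the page bit and offset.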
Charles Lasner
Jun 5, 1992, 8:12:05 AM
But are these other songs descriptions, or are they actually a subset of the
words of the song? (The Jan & Dean song qualifies as such.)
cjl (2 drives for every controller — Computer City, here we come)
CP/M lives!
Jun 5, 1992, 1:38:19 PM
> But are these other songs descriptions, or are they actually a subset of the
> words of the song? (The Jan & Dean song qualifies as such.)
«Several species….» is an instrumental. No words for it to be a subset of.
Roger Ivie
iv…@cc.usu.edu
CP/M lives!
Jun 5, 1992, 1:36:45 PM
In article <1992Jun5.0…@news.columbia.edu>, las…@watsun.cc.columbia.edu (Charles Lasner) writes:
> Someone told me once about a non-DEC PDP-8 clone design where the documentation
> actually bothered to explain the -8 opcodes as 3 hex digits, not the customary
> octal. He claims it even almost made some sense. Think of the indirect bit
> as forming alternate opcodes when set gives you 16 instruction groups instead
> of 8, and then a two nybble address to locate the operand in.
The PDP-5 front panel that I picked up somewhere groups both the Indirect and
Page Zero bits in with the opcode, which could divide the instructions into
32 groups. Now if only the OPeRate divisions were made that way, we’d be
set…
Roger Ivie
iv…@cc.usu.edu
Hannu Helminen ti
Jun 5, 1992, 1:57:49 PM
> In article <DM.92Ju…@stekt2.oulu.fi> d…@stekt2.oulu.fi (Hannu Helminen ti) writes:
> >What about Finnish:
> >Ep{j{rjestelm{llistytt{m{tt|myydell{ns{k{{n
> >({ is letter a with two dots over it)
> >»Ep{» is actually a prefix, but you can use the word without it.
> >Please don’t ask me to explain what it means
> Ah, but «h{{y|aie» is so much more impressive (one consonant, seven
> vowels) and just as vacuous… And don’t forget that your `k{{n’,
> `ns{‘, `ll{‘, `myys’, and `m{t|n’ are all suffixes… Pity I can’t
> figure out what it means to j{rjestelm{llistytt{{. (I think I got
> that right.)
Yes, they are suffixes, but I think they should be allowed in this
«long word» contest, since that is how many Finnish words are made.
One could make virtually arbitrarily long words by just concatenating
words together (in Finnish anyway), but the number of suffixes one can
tag on one word is always limited.
‘to j{rjestelm{llistytt{{‘ means something like ‘to order’ or rather,
‘to make something be ordered’ (with emphasis on the act of making).
It is derived from ‘j{rjestelm{llinen’ (ordered) with a suffix
So basically what the whole word means is ‘even with his/her ability
to make things be unordered’
This is folklore all right, but what does it have to do with computers?
> -GAWollman
Bill Potter
Jun 5, 1992, 2:38:01 PM
In article <1992Jun3.1…@desire.wright.edu>,
tha…@desire.wright.edu writes:
>
> And reagrdless of whatever it means, why wouldn’t the typical German
> come up with a shorter word? Maybe such as the (ahem) all-American
> «thingamabob»?
Ah, but they do — have you not heard of a «dingsbums»? Sorry, make that
a «Dingsbums».
=============================================================================
Bill Potter : unido!pcsbst!billp : croft — n. — a piece of land
PCS GmbH : bi…@pcsbst.pcs.com : in the Highlands surrounded
D8000 Muenchen : You can’t sink a RAINBOW : completely by regulations.
=============================================================================
Ralph ‘Hairy’ Moonen
Jun 5, 1992, 4:08:57 PM
In article <1992Jun4.1…@news.columbia.edu>, las…@watsun.cc.columbia.edu (Charles Lasner) writes:
> Don’t know about the longest word, but how ’bout the longest pop song title:
>
> Jan and Dean’s
>
> Anaheim, Azusa, Cookamonga Sewing Circle, Book Review and Timing Association.
No, it’s probably:
The sound of several species of small furry animals gathered together in a cave
grooving with a pict
By Pink Floyd. All you AFU’ers of course know this song is about gerbils
—Ralph Moonen
Michael Qvortrup
Jun 5, 1992, 3:35:45 PM
In article <1992Jun3.1…@desire.wright.edu> tha…@desire.wright.edu writes:
>In article <MJD.92Ju…@saul.cis.upenn.edu>,
>m…@saul.cis.upenn.edu (He that would sup with The Devil) writes:
>>
>> Perhaps I’m mistaken, but you’re not a native speaker of English,
>> are you?
>>
>> `Antidisestablishmenatrianism’ has only one stem word: `Establish’.
>> Although it’s uncommon in English to concatentate more than two stem
>> words, and we don’t have any constructions like German’s
>> `Schutzengrabenvernichtungspanzerkraftwagen’ …
>
>I’m a native English speaker (well, *American* English …) and I speak no
>German, but I am intrigued: what the *HECK* is «schutzen… (etc., et al.)»?
Well, let us split it apart and have a look at it:
Schutzengraben (should actually be Sch»u…)
trench (as in WW1 trench out at the front, see where this is leading?)
(direct translation: gunner hole)
Vernichtung
noun form of the verb vernichten meaning to destroy
Panzer
armour
Kraftwagen
automotive vehicle (common denominator for car and truck)
Panzerkraftwagen
some kind of armoured vehicle (i.e. a tank)
The word then means an armoured vehicle for the destruction of trenches.
(Could also be a destructive armoured vehicle for use in trenches).
>And reagrdless of whatever it means, why wouldn’t the typical German
>come up with a shorter word? Maybe such as the (ahem) all-American
>»thingamabob»?
You mean ‘dingsda’? That would translate roughly as ‘thingamabob’.
The average German would look just as confused when confronted by such
a word, as the English speaking person would upon being confronted with
‘flocci …’.
The above word might have been in use earlier in this century, but the
tendency to construct such words seems to have died out.
Greetings,
—Mike
—
#include <std-disclm.h>—«… and there is a small flaw in my character.»—
Real Life: Michael Christian Heide Qvortrup A Dane ETH, Zuerich
e-mail : qvor…@inf.ethz.ch abroad Switzerland
Institut fuer wissenschaftliches Rechnen / Inst. of Scientific Computation
Dinda Peter
Jun 5, 1992, 7:37:49 PM
My German born mother always uses «Dings» — I was born here and prefer
«whachamacallit.»
Kenneth Crudup
Jun 5, 1992, 7:56:40 PM
In article <1992May31….@ukw.uucp> lu…@ukw.uucp
(Lupe Christoph) writes:
>The longest English word in /usr/dict/words on SunOS 4.1.2 are
>counterproductive and indistinguishable (both 17 letters).
Now that you’ve described them, what are the two words?
-Kenny
—
Kenneth R. Crudup, Contractor, OSF DCE QA
OSF, 11 Cambridge Center, Cambridge, MA 02142 +1 617 621 7306
ke…@osf.osf.org OSF has nothing to do with this post.
America: Where you can be taped beating a Black man and still get away with it.
Harry Bloomberg
Jun 5, 1992, 7:45:43 PM
>Do such organizations with multi-purpose names actually exist?
>
>I have heard of the Brooklyn Heights Poker and Literary Guild BTW. They were
>once peripherally involved with certain computer people, etc., a long story
>to be repeated in another post.
>
There’s an organization in the Pittsburgh area called the South Hills
Brass Pounders and Modulators Amateur Radio Club Inc.
Because brass pounding (sending Morse Code) and modulating (voice
communications) are pretty old technologies, I suggested they change
their name to the South Hills Brass Pounders, Modulators, and Digitizers
Amateur Radio Club Inc. The motion died for lack of a second.
Harry Bloomberg
h…@vms.cis.pitt.edu
Alfvaen
Jun 5, 1992, 8:45:05 PM
Thomas Farmer writes
Do they do things antidisestablishmentarianistically?
—
—Alfvaen(a.k.a. Aaron V. Humphrey)
Canadian Network For Space Research, Edmonton, Alberta, Canada
Her hair spilled out like rootbeer…
Current Album—The Hooters:Nervous Night
Alfvaen
Jun 5, 1992, 8:44:30 PM
Chip Olson writes
Christine Lavin has a song called «Regretting what I said» whose full title,
as said at the beginning of the song, is more like «Regretting what I said
when you called me at 3 a.m. to tell me you were going to drive to the
airport, get on a plane and go skiing in the Alps for three weeks, even
though you know I hate skiing and I couldn’t expect you to pay my way, but I
don’t like surprises.» I’ve probably mangled it horribly, since this is
from memory and I’ve only heard the song a few times…
Alfvaen
Jun 5, 1992, 8:50:31 PM
CP/M lives! writes
Well, it’s not quite an instrumental; the Pict does say some things at the
end, in a Scottish-type accent. And a lot of the furry-animal sounds are
human voices, albeit somewhat modified and not really forming words. But,
on the whole, it’s fairly instrumental.
> Roger Ivie
> iv…@cc.usu.edu
Kristian Koehntopp
Jun 6, 1992, 12:15:17 AM
In <1992Jun4.1…@sci.kun.nl> ha…@cs.kun.nl (Hans Mulder) writes:
>Care to explain how the word «everyday» came into being if it wasn’t by
>using the wordsruntogethermethod? Or is «everyday» a Greek loan word, too?
Yes, and please explain «folk lore» also.
Kristian
—
Kristian Koehntopp, Harmsstrasse 98, FRG W-2300 Kiel, +49 431 676689
«I heard it on talk.bizarre, so I’m sceptical …»
— aa…@space.ualberta.ca in alt.folklore.computers
Kristian Koehntopp
Jun 6, 1992, 12:17:04 AM
> .ASCIZ /cjl/
>is one longword on a VAX.
Have you disassembled them? What if you execute my name on your
machine?
Charlie Gibbs
Jun 6, 1992, 2:31:45 AM
In article <1992Jun5.0…@news.columbia.edu>
las…@watsun.cc.columbia.edu (Charles Lasner) writes:
>But are these other songs descriptions, or are they actually a subset of the
>words of the song? (The Jan & Dean song qualifies as such.)
Nope, they’re the actual titles as given in the list on the
album cover. I don’t have that particular Shawn Phillips album,
but I seem to recall that the liner notes abbreviated that title
to SWWFHMATSITAYKILYBBIGTHTL (or whatever) for the several times
it was mentioned. I do have a copy of Pink Floyd’s «Ummagumma»,
though. The title of «Several Species…» is definitely not a
subset of the words, because there are no words (unless you count
the nearly incomprehensible Gaelic rambling at the end, which
nonetheless ends with the words «…and the wind cried Mary»).
It makes for a crowded label.
One of Jefferson Airplane’s more obscure albums contained
a song titled «Never Argue With a German When You’re Tired or
European Song». That title did appear, more or less, in the
lyrics, except that it came out more like «Streichen Sie nicht
mit einem Deutscher wenn Sie müde sind.»
Charli…@mindlink.bc.ca
If your nose runs and your feet smell, you’re built upside-down.
Doug Landauer
Jun 6, 1992, 5:27:35 AM
> > Don’t know about the longest word, but how ’bout the longest
> > pop song title:
^^^
> > Jan and Dean’s «Anaheim, Azusa, Cookamonga …»
>
> «The sound of several species of small fury animals …»
(is Pink Floyd «pop»?)
Christine Lavin does have one song with a rather long title. Now, I
wouldn’t presume to claim that it’s the longest pop song title, but it’s
certainly the longest one I’ve ever heard. As I recall, it’s
«Regretting what I said to you when you called me at eleven
o’clock on a Friday morning to tell me that at one
o’clock that afternoon you were going to leave the
office, go downstairs, hail a cab to go to the airport,
and fly to Europe to go skiing in the Alps for two
weeks; not that I wanted to go, I couldn’t get away
from the office, I’m not that good of a skiier and I
couldn’t expect you to pay my way but after going
out with you for three years, I *don’t* *like* *surprises*!»
[Just try to type that in one breath!]
It’s subtitled «A Musical Apology»; it’s one of my favorite
funny songs. The first few lines:
I didn’t mean it when I said, «I hope the cable …
in the elevator … snaps — when you get on board.»
And I was joking when I said, «I hope you crack your head,
and get mangled by the downstairs revolving door.»
And I was kidding when I said, «I hope the number 103 bus
hits and makes a pancake out of you.»
…
Followups to rec.music.folk …
—
Doug Landauer — land…@eng.sun.com
SMI[STE]->SunPro::Languages.PE(C++);
Hans Mulder
Jun 6, 1992, 5:00:42 AM
>Super-cali-fragi-listic-ixpi-alle-docious
Super-cal-ifrag-ilistic-expi-ali-docious
>>Please no ‘spelling’ flames, I never saw the movie.
That’s OK, but you really ought to have the TeX book handy (it’s on page 450).
>Sorry, couldn’t resist.
Sorry, I couldn’t either.
—
Hans Mulder ha…@cs.kun.nl
Taed Nelson
Jun 5, 1992, 8:29:57 PM
> > Anaheim, Azusa, Cookamonga Sewing Circle, Book Review and Timing Association.
> The sound of several species of small fury animals gathered together in a cave
> grooving with a pict
Clearly, you people don’t listen to folk music. There is a popular (as
popular as new folk gets, anyway) song by Christine Lavin entitled:
«Regretting what I said to you when you called me eleven o’clock on
a Friday morning to tell me that at one o’clock Friday afternoon, you’re
going to leave your office, go downstairs, hail a cab to go to the airport
to catch a plane to ski in the Alps for two weeks — not that I wanted to go
with you, I wasn’t able to leave town, I’m not a very good skiier, I
couldn’t expect you to pay my way, but after going out with you for three
years, I don’t like surprises.»
And to add even more, it’s _subtitled_ «(A musical apology)» just to make it
easier to talk about. It’s usually referred to as «Regretting what I
said…» since it’s generally frowned upon when it takes twenty minutes just
to ask someone if they’ve heard a song…
Felix Finch
Jun 6, 1992, 9:49:35 PM
> The sound of several species of small fury animals gathered together in a cave
> grooving with a pict
> By Pink Floyd. All you AFU’ers of course know this song is about gerbils
You Don’t Bring Me Floriculturally Diverse Polyfragrant Soilistically
Challenged Multipetaled Victims of Pesticidal Food Chain Chauvinism
by the Capitol Steps, on ’76 Bad Loans.
—
… _._. ._ ._. . _._. ._. ___ .__ ._. . .__. ._ .. ._.
Felix Finch, scarecrow repairer / fe…@crowfix.com / uunet!crowfix!felix
Barbara Trumpinski
Jun 8, 1992, 2:22:45 AM
>Clearly, you people don’t listen to folk music. There is a popular (as
> popular as new folk gets, anyway) song by Christine Lavin entitled:
>«Regretting what I said to you when you called me eleven o’clock on
> a Friday morning to tell me that at one o’clock Friday afternoon, you’re
> going to leave your office, go downstairs, hail a cab to go to the airport
> to catch a plane to ski in the Alps for two weeks — not that I wanted to go
> with you, I wasn’t able to leave town, I’m not a very good skiier, I
> couldn’t expect you to pay my way, but after going out with you for three
> years, I don’t like surprises.»
>And to add even more, it’s _subtitled_ «(A musical apology)» just to make it
> easier to talk about. It’s usually referred to as «Regretting what I
> said…» since it’s generally frowned upon when it takes twenty minutes just
> to ask someone if they’ve heard a song…
and it is a great song….
barb
—
***************************************************************************
conan the librarian a.k.a. kitten / / barbara ann
«my life’s a soap opera, isn’t yours?» {=.=}
~ trum…@alexia.lis.uiuc.edu
«everything starts as someone’s daydream.» larry niven
«to books that are the axes for the ice on our souls»
Sam Wilson
Jun 8, 1992, 4:01:50 PM
rog…@rosencrantz.osf.org (Andrew Rogers) writes:
> There’s a late-60’s Tyrannosaurus Rex LP with a title something like «My
> people were fair <mumble mumble> stars in their hair but now they’re
> content <mumble mumble> on their brows».
«My people were fair and had sky in their hair but now they’re content
to wear stars on their brows.» (Hang the capitals!)
It’s actually the last 2 (or 4) lines of one of the songs (the last
song?) on the album, but I can’t remember what the song was actually
entitled. Must be 20 years…
By the way, I realise this isn’t really a contender for the ‘longest
song title’, but I’ve always had a soft spot for the Rezillos’
«(I Love My Baby Cos She Does) Good Sculptures»
Sam
tha…@desire.wright.edu
Jun 9, 1992, 1:26:03 AM
And earlier «dingsda» was nominated for thingamabob. Don’t tell me: dingsbums
and dingsda are male/female or singular/plural, right?
Dings sounds good in that case!
——ted hayes
Phil Abercrombie
Jun 10, 1992, 4:00:58 AM
>>>>> On 2 Jun 92 17:20:21 GMT, ev…@hpl.hp.com (Evan Kirshenbaum) said:
Evan> In any case, «antidisestablishmentarianism» still seems like the
Evan> longest (and it *is* in /usr/dict/words on HP-UX 8.0). Even if you
Evan> insist that it should be «anti-» (and I don’t believe it is/was
Evan> spelled that way consistently), «disestablishmentarianism» has 24
Evan> letters. Neither one seems to get much use these days.
I seem to recall that Heseltine was talking about the dis-establishment
of the Church of England some 8 months ago.
As is my nature, I was very much against all he proposed.
Phil
—
Phil Abercrombie | Living in another country | _~o _O | BIKE TO
aberc…@mdcbbs.com | Under another name | -<,-<, | WORK
| | (*)/===/(*) |
Charles Lasner
Jun 10, 1992, 8:09:30 AM
In the ’60’s there was a song entitled «Gimme dat ding» anyone know the artist?
cjl
Mark Slagle
Jun 11, 1992, 4:28:56 AM
> In the ’60’s there was a song entitled «Gimme dat ding» anyone
> know the artist?
I distinctly recall hearing Chuck Berry do the song, but I don’t
know if he is the author. Still, I thought it sounded more like
«Gimme dat ting.»
—
—-
Mark E. Slagle PO Box 61059
sla…@lmsc.lockheed.com Sunnyvale, CA 94088
408-756-0895 USA
Peter Kittel
Jun 10, 1992, 2:57:52 PM
tha…@desire.wright.edu writes:
It’s neutral, singular. «Ding» is the exact translation of the english
noun «thing», the version «Dings» is slang and «Dingsbums» also.
As far as I know, there are some dialects of English who turn the «th»
into a «d», so I’m not surprised that Englishmen like this word, too.
Where I always have difficulties: Where does that «foobar» (or foo, bar)
stuff come from? It’s not a normal english word?
—
Best regards, Dr. Peter Kittel // E-Mail to \ Only my personal opinions…
Commodore Frankfurt, Germany X/ {uunet|pyramid|rutgers}!cbmvax!cbmger!peterk
or pet…@public.sub.org
This article is about the way computers organise data stored on media such as disk. For library and office filing systems, see Library classification.
In computing, a file system or filesystem (often abbreviated to fs) is a method and data structure that the operating system uses to control how data is stored and retrieved.[1] Without a file system, data placed in a storage medium would be one large body of data with no way to tell where one piece of data stopped and the next began, or where any piece of data was located when it was time to retrieve it. Separating the data into pieces and giving each piece a name makes the data easy to isolate and identify. Taking its name from the way a paper-based data management system is named, each group of data is called a «file». The structure and logic rules used to manage the groups of data and their names are called a «file system».
There are many kinds of file systems, each with unique structure and logic, properties of speed, flexibility, security, size and more. Some file systems have been designed to be used for specific applications. For example, the ISO 9660 and UDF file systems are designed specifically for optical discs.
File systems can be used on many types of storage devices using various media. As of 2019, hard disk drives have been key storage devices and are projected to remain so for the foreseeable future.[2] Other kinds of media that are used include SSDs, magnetic tapes, and optical discs. In some cases, such as with tmpfs, the computer’s main memory (random-access memory, RAM) is used to create a temporary file system for short-term use.
Some file systems are used on local data storage devices;[3] others provide file access via a network protocol (for example, NFS,[4] SMB, or 9P clients). Some file systems are «virtual», meaning that the supplied «files» (called virtual files) are computed on request (such as procfs and sysfs) or are merely a mapping into a different file system used as a backing store. The file system manages access to both the content of files and the metadata about those files. It is responsible for arranging storage space; reliability, efficiency, and tuning with regard to the physical storage medium are important design considerations.
Origin of the term
Before the advent of computers the term file system was used to describe a method of storing and retrieving paper documents.[5] By 1961, the term was being applied to computerized filing alongside the original meaning.[6] By 1964, it was in general use.[7]
Architecture
A file system consists of two or three layers. Sometimes the layers are explicitly separated, and sometimes the functions are combined.[8]
The logical file system is responsible for interaction with the user application. It provides the application program interface (API) for file operations (OPEN, CLOSE, READ, etc.) and passes the requested operation to the layer below it for processing. The logical file system «manage[s] open file table entries and per-process file descriptors».[9] This layer provides «file access, directory operations, [and] security and protection».[8]
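As an illustration of the open, read and close operations named above, the following minimal sketch uses Python's os module, which exposes descriptor-based calls on POSIX-like systems. The function name roundtrip and the file name example.dat are hypothetical, chosen only for this example; it is not a description of how any particular file system implements these operations.

```python
import os
import tempfile


def roundtrip(payload: bytes) -> bytes:
    """Write payload to a scratch file and read it back via file descriptors."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.dat")  # hypothetical file name

        # OPEN for writing: the operating system returns a per-process descriptor.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            os.write(fd, payload)
        finally:
            os.close(fd)  # CLOSE releases the descriptor

        # OPEN again for reading, READ up to len(payload) bytes, then CLOSE.
        fd = os.open(path, os.O_RDONLY)
        try:
            return os.read(fd, len(payload))
        finally:
            os.close(fd)
```

The application sees only descriptors and byte counts; where the blocks end up on the medium is left to the layers below.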
The second optional layer is the virtual file system. «This interface allows support for multiple concurrent instances of physical file systems, each of which is called a file system implementation».[9]
The third layer is the physical file system. This layer is concerned with the physical operation of the storage device (e.g. disk). It processes physical blocks being read or written. It handles buffering and memory management and is responsible for the physical placement of blocks in specific locations on the storage medium. The physical file system interacts with the device drivers or with the channel to drive the storage device.[8]
Aspects of file systems
Space management
Note: this only applies to file systems used in storage devices.
An example of slack space, demonstrated with 4,096-byte NTFS clusters: 100,000 files of five bytes each hold 500,000 bytes of actual data but require 409,600,000 bytes of disk space to store.
File systems allocate space in a granular manner, usually in multiples of a physical unit on the device. The file system is responsible for organizing files and directories, and keeping track of which areas of the media belong to which file and which are not being used. For example, in Apple DOS of the early 1980s, 256-byte sectors on a 140-kilobyte floppy disk were tracked with a track/sector map.
This results in unused space when a file is not an exact multiple of the allocation unit, sometimes referred to as slack space.[10] For a 512-byte allocation, the average unused space is 256 bytes. For 64 KB clusters, the average unused space is 32 KB. The size of the allocation unit is chosen when the file system is created. Choosing the allocation size based on the average size of the files expected to be in the file system can minimize the amount of unusable space. Frequently the default allocation may provide reasonable usage. Choosing an allocation size that is too small results in excessive overhead if the file system will contain mostly very large files.
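The slack-space arithmetic can be checked with a short sketch. It assumes every file is stored in whole allocation units and ignores per-file metadata and any special handling a real file system may give very small files. The function names are hypothetical.

```python
def allocated_bytes(file_size: int, unit: int) -> int:
    """Bytes occupied when space is handed out in whole allocation units."""
    units = -(-file_size // unit)  # ceiling division without floating point
    return units * unit


def slack_bytes(file_size: int, unit: int) -> int:
    """Allocated but unused bytes at the end of the last allocation unit."""
    return allocated_bytes(file_size, unit) - file_size


# The example from the caption: 100,000 files of five bytes in 4,096-byte clusters.
file_sizes = [5] * 100_000
actual_data = sum(file_sizes)
disk_usage = sum(allocated_bytes(size, 4096) for size in file_sizes)
```

Here actual_data is 500,000 bytes while disk_usage is 409,600,000 bytes, matching the figures given for the NTFS example.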
File system fragmentation occurs when unused space or single files are not contiguous. As a file system is used, files are created, modified and deleted. When a file is created, the file system allocates space for the data. Some file systems permit or require specifying an initial space allocation and subsequent incremental allocations as the file grows. As files are deleted, the space they were allocated eventually is considered available for use by other files. This creates alternating used and unused areas of various sizes. This is free space fragmentation. When a file is created and there is not an area of contiguous space available for its initial allocation, the space must be assigned in fragments. When a file is modified such that it becomes larger, it may exceed the space initially allocated to it; another allocation must then be assigned elsewhere, and the file becomes fragmented.[11]
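The way deletions lead to fragmented files can be shown with a toy model. The sketch below assumes a ten-block disk and a naive allocator that simply takes the lowest-numbered free blocks; real allocators are considerably more sophisticated, and all names here are hypothetical.

```python
def allocate(disk: list, name: str, nblocks: int) -> None:
    """Give `name` the lowest-numbered free blocks, contiguous or not."""
    free = [i for i, owner in enumerate(disk) if owner is None]
    if len(free) < nblocks:
        raise OSError("not enough free blocks")
    for i in free[:nblocks]:
        disk[i] = name


def delete(disk: list, name: str) -> None:
    """Mark every block owned by `name` as free again."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None


def extents(disk: list, name: str) -> int:
    """Count the contiguous runs of blocks owned by `name`."""
    runs, previous = 0, False
    for owner in disk:
        current = owner == name
        if current and not previous:
            runs += 1
        previous = current
    return runs


disk = [None] * 10
for name in ("A", "B", "C"):
    allocate(disk, name, 3)   # blocks 0-8 in use, block 9 free
delete(disk, "B")             # leaves a three-block hole in the middle
allocate(disk, "D", 4)        # needs four blocks: the hole plus block 9
```

After these steps file D occupies blocks 3 to 5 and block 9, so it is stored in two fragments even though it was written in a single operation.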
In some operating systems, a system administrator may use disk quotas to limit the allocation of disk space.
Filenames
A filename (or file name) is used to identify a storage location in the file system. Most file systems have restrictions on the length of filenames. In some file systems, filenames are not case sensitive (i.e., the names MYFILE and myfile refer to the same file in a directory); in others, filenames are case sensitive (i.e., the names MYFILE, MyFile, and myfile refer to three separate files that are in the same directory).
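The difference between the two naming policies can be sketched with a directory modelled as a dictionary. This is a simplified illustration, not the lookup routine of any real file system; the Directory class is hypothetical, and case folding with str.casefold is an assumption made for the example.

```python
class Directory:
    """Toy directory that maps filenames to file contents."""

    def __init__(self, case_sensitive: bool):
        self.case_sensitive = case_sensitive
        self._entries = {}

    def _key(self, name: str) -> str:
        # A case-insensitive directory compares names after folding case.
        return name if self.case_sensitive else name.casefold()

    def create(self, name: str, data: str) -> None:
        self._entries[self._key(name)] = data

    def lookup(self, name: str) -> str:
        return self._entries[self._key(name)]

    def __len__(self) -> int:
        return len(self._entries)


sensitive = Directory(case_sensitive=True)
insensitive = Directory(case_sensitive=False)
for name in ("MYFILE", "MyFile", "myfile"):
    sensitive.create(name, "written as " + name)
    insensitive.create(name, "written as " + name)
```

The case-sensitive directory ends up with three entries, while the case-insensitive one holds a single entry that each later write has overwritten.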
Most modern file systems allow filenames to contain a wide range of characters from the Unicode character set. However, they may have restrictions on the use of certain special characters, disallowing them within filenames; those characters might be used to indicate a device, device type, directory prefix, file path separator, or file type.
Directories
File systems typically have directories (also called folders) which allow the user to group files into separate collections. This may be implemented by associating the file name with an index in a table of contents or an inode in a Unix-like file system. Directory structures may be flat (i.e. linear), or allow hierarchies where directories may contain subdirectories. The first file system to support arbitrary hierarchies of directories was used in the Multics operating system.[12] The native file systems of Unix-like systems also support arbitrary directory hierarchies, as do, for example, Apple’s Hierarchical File System and its successor HFS+ in classic Mac OS, the FAT file system in MS-DOS 2.0 and later versions of MS-DOS and in Microsoft Windows, the NTFS file system in the Windows NT family of operating systems, and the ODS-2 (On-Disk Structure-2) and higher levels of the Files-11 file system in OpenVMS.
Metadata
Other bookkeeping information is typically associated with each file within a file system. The length of the data contained in a file may be stored as the number of blocks allocated for the file or as a byte count. The time that the file was last modified may be stored as the file’s timestamp. File systems might store the file creation time, the time it was last accessed, the time the file’s metadata was changed, or the time the file was last backed up. Other information can include the file’s device type (e.g. block, character, socket, subdirectory, etc.), its owner user ID and group ID, its access permissions and other file attributes (e.g. whether the file is read-only, executable, etc.).
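On many systems this bookkeeping information can be inspected from a program. The sketch below uses Python's os.stat and the stat module; the function names describe and demo and the dictionary keys are hypothetical, and which fields are meaningful depends on the operating system and file system in use.

```python
import os
import stat
import tempfile


def describe(path: str) -> dict:
    """Collect a few of the metadata fields the file system keeps for `path`."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,                    # length as a byte count
        "modified": info.st_mtime,                     # last modification time
        "accessed": info.st_atime,                     # last access time
        "is_directory": stat.S_ISDIR(info.st_mode),    # file type
        "permissions": stat.S_IMODE(info.st_mode),     # permission bits only
        "owner_uid": info.st_uid,
        "group_gid": info.st_gid,
    }


def demo() -> dict:
    """Create an 11-byte scratch file and return its metadata."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "note.txt")  # hypothetical file name
        with open(path, "wb") as handle:
            handle.write(b"hello world")
        return describe(path)
```

None of these values is stored inside the file's contents; they come from the structures the file system maintains alongside the data.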
A file system stores all the metadata associated with the file—including the file name, the length of the contents of a file, and the location of the file in the folder hierarchy—separate from the contents of the file.
Most file systems store the names of all the files in one directory in one place—the directory table for that directory—which is often stored like any other file.
Many file systems put only some of the metadata for a file in the directory table, and the rest of the metadata for that file in a completely separate structure, such as the inode.
Most file systems also store metadata not associated with any one particular file.
Such metadata includes information about unused regions—free space bitmap, block availability map—and information about bad sectors.
Often such information about an allocation group is stored inside the allocation group itself.
Additional attributes can be associated on file systems, such as NTFS, XFS, ext2, ext3, some versions of UFS, and HFS+, using extended file attributes. Some file systems provide for user defined attributes such as the author of the document, the character encoding of a document or the size of an image.
Some file systems allow for different data collections to be associated with one file name. These separate collections may be referred to as streams or forks. Apple has long used a forked file system on the Macintosh, and Microsoft supports streams in NTFS. Some file systems maintain multiple past revisions of a file under a single file name; the filename by itself retrieves the most recent version, while prior saved versions can be accessed using a special naming convention such as «filename;4» or «filename(-4)» to access the version four saves ago.
See comparison of file systems#Metadata for details on which file systems support which kinds of metadata.
File system as an abstract user interface
In some cases, a file system may not make use of a storage device but can be used to organize and represent access to any data, whether it is stored or dynamically generated (e.g. procfs).
Utilities
File systems include utilities to initialize, alter parameters of and remove an instance of the file system. Some include the ability to extend or truncate the space allocated to the file system.
Directory utilities may be used to create, rename and delete directory entries, which are also known as dentries (singular: dentry),[13] and to alter metadata associated with a directory. Directory utilities may also include capabilities to create additional links to a directory (hard links in Unix), to rename parent links («..» in Unix-like operating systems), and to create bidirectional links to files.
File utilities create, list, copy, move and delete files, and alter metadata. They may be able to truncate data, truncate or extend space allocation, append to, move, and modify files in-place. Depending on the underlying structure of the file system, they may provide a mechanism to prepend to or truncate from the beginning of a file, insert entries into the middle of a file, or delete entries from a file. Utilities to free space for deleted files, if the file system provides an undelete function, also belong to this category.
Some file systems defer operations such as reorganization of free space, secure erasing of free space, and rebuilding of hierarchical structures by providing utilities to perform these functions at times of minimal activity. An example is the file system defragmentation utilities.
Some of the most important features of file system utilities are supervisory activities which may involve bypassing ownership or direct access to the underlying device. These include high-performance backup and recovery, data replication, and reorganization of various data structures and allocation tables within the file system.
Restricting and permitting access
There are several mechanisms used by file systems to control access to data. Usually the intent is to prevent reading or modifying files by a user or group of users. Another reason is to ensure data is modified in a controlled way so access may be restricted to a specific program. Examples include passwords stored in the metadata of the file or elsewhere and file permissions in the form of permission bits, access control lists, or capabilities. The need for file system utilities to be able to access the data at the media level to reorganize the structures and provide efficient backup usually means that these are only effective for polite users but are not effective against intruders.
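Permission bits, one of the mechanisms mentioned above, can be decoded with Python's stat module. The sketch assumes Unix-style mode bits; the helper name can and its string arguments are hypothetical. It only inspects the bits and does not reproduce the full check an operating system performs, which also decides whether the caller counts as owner, group member or other.

```python
import stat

# Map (class of user, action) to the corresponding Unix permission bit.
_BITS = {
    ("owner", "read"): stat.S_IRUSR,
    ("owner", "write"): stat.S_IWUSR,
    ("owner", "execute"): stat.S_IXUSR,
    ("group", "read"): stat.S_IRGRP,
    ("group", "write"): stat.S_IWGRP,
    ("group", "execute"): stat.S_IXGRP,
    ("other", "read"): stat.S_IROTH,
    ("other", "write"): stat.S_IWOTH,
    ("other", "execute"): stat.S_IXOTH,
}


def can(mode: int, who: str, action: str) -> bool:
    """Report whether the permission bit for (who, action) is set in mode."""
    return bool(mode & _BITS[(who, action)])


mode = 0o640  # owner may read and write, group may read, others get nothing
listing = stat.filemode(stat.S_IFREG | mode)  # the form shown by `ls -l`
```

For mode 0o640 the listing string is «-rw-r-----». A utility that reads the medium directly never consults these bits, which is why the text describes them as effective only against polite users.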
Methods for encrypting file data are sometimes included in the file system. This is very effective since there is no need for file system utilities to know the encryption seed to effectively manage the data. The risks of relying on encryption include the fact that an attacker can copy the data and use brute force to decrypt the data. Additionally, losing the seed means losing the data.
Maintaining integrity
One significant responsibility of a file system is to ensure that the file system structures in secondary storage remain consistent, regardless of the actions by programs accessing the file system. This includes actions taken if a program modifying the file system terminates abnormally or neglects to inform the file system that it has completed its activities. This may include updating the metadata, the directory entry and handling any data that was buffered but not yet updated on the physical storage media.
Other failures which the file system must deal with include media failures or loss of connection to remote systems.
In the event of an operating system failure or «soft» power failure, special routines in the file system must be invoked similar to when an individual program fails.
The file system must also be able to correct damaged structures. These may occur as a result of an operating system failure for which the OS was unable to notify the file system, a power failure, or a reset.
The file system must also record events to allow analysis of systemic issues as well as problems with specific files or directories.
User data
The most important purpose of a file system is to manage user data. This includes storing, retrieving and updating data.
Some file systems accept data for storage as a stream of bytes which are collected and stored in a manner efficient for the media. When a program retrieves the data, it specifies the size of a memory buffer and the file system transfers data from the media to the buffer. A runtime library routine may sometimes allow the user program to define a record based on a library call specifying a length. When the user program reads the data, the library retrieves data via the file system and returns a record.
Some file systems allow the specification of a fixed record length which is used for all writes and reads. This facilitates locating the nth record as well as updating records.
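With a fixed record length, the position of the nth record is simply n multiplied by the record length, which is what makes direct access and in-place updates cheap. The sketch below models this on an in-memory byte stream; the 16-byte record length, the zero-byte padding and the function names are assumptions made for the example.

```python
import io

RECORD_LEN = 16  # hypothetical fixed record length in bytes


def write_record(stream, n: int, data: bytes) -> None:
    """Store data as record n, padded with zero bytes to the fixed length."""
    if len(data) > RECORD_LEN:
        raise ValueError("record too long")
    stream.seek(n * RECORD_LEN)          # offset follows directly from n
    stream.write(data.ljust(RECORD_LEN, b"\x00"))


def read_record(stream, n: int) -> bytes:
    """Fetch record n and strip the zero padding added by write_record."""
    stream.seek(n * RECORD_LEN)
    return stream.read(RECORD_LEN).rstrip(b"\x00")


store = io.BytesIO()
for index, payload in enumerate([b"alpha", b"beta", b"gamma"]):
    write_record(store, index, payload)
write_record(store, 1, b"BETA-2")  # update in place; neighbours are untouched
```

Because the padding is stripped on reading, this toy format cannot store payloads that themselves end in zero bytes; a real record format would carry an explicit length.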
An identification for each record, also known as a key, makes for a more sophisticated file system. The user program can read, write and update records without regard to their location. This requires complicated management of blocks of media usually separating key blocks and data blocks. Very efficient algorithms can be developed with pyramid structures for locating records.[14]
Using a file system
Utilities, language specific run-time libraries and user programs use file system APIs to make requests of the file system. These include data transfer, positioning, updating metadata, managing directories, managing access specifications, and removal.
Multiple file systems within a single system
Frequently, retail systems are configured with a single file system occupying the entire storage device.
Another approach is to partition the disk so that several file systems with different attributes can be used. One file system, for use as browser cache or email storage, might be configured with a small allocation size. This keeps the activity of creating and deleting files typical of browser activity in a narrow area of the disk where it will not interfere with other file allocations. Another partition might be created for the storage of audio or video files with a relatively large block size. Yet another may normally be set read-only and only periodically be set writable.
A third approach, which is mostly used in cloud systems, is to use «disk images» to house additional file systems, with the same attributes or not, within another (host) file system as a file. A common example is virtualization: one user can run an experimental Linux distribution (using the ext4 file system) in a virtual machine under his/her production Windows environment (using NTFS). The ext4 file system resides in a disk image, which is treated as a file (or multiple files, depending on the hypervisor and settings) in the NTFS host file system.
Having multiple file systems on a single system has the additional benefit that in the event of a corruption of a single partition, the remaining file systems will frequently still be intact. This includes virus destruction of the system partition or even a system that will not boot. File system utilities which require dedicated access can be effectively completed piecemeal. In addition, defragmentation may be more effective. Several system maintenance utilities, such as virus scans and backups, can also be processed in segments. For example, it is not necessary to back up the file system containing videos along with all the other files if none have been added since the last backup. As for the image files, one can easily «spin off» differential images which contain only «new» data written to the master (original) image. Differential images can be used both for safety (as a «disposable» system that can be quickly restored if destroyed or contaminated by a virus, since the old image can be removed and a new image created in a matter of seconds, even without automated procedures) and for quick virtual machine deployment (since the differential images can be quickly spawned in batches using a script).
Design limitations
All file systems have some functional limit that defines the maximum storable data capacity within that system. These functional limits are a best-guess effort by the designer based on how large the storage systems are right now and how large storage systems are likely to become in the future. Disk storage has continued to increase at near exponential rates (see Moore’s law), so after a few years, file systems have kept reaching design limitations that require computer users to repeatedly move to a newer system with ever-greater capacity.
File system complexity typically varies proportionally with the available storage capacity. The file systems of early 1980s home computers with 50 KB to 512 KB of storage would not be a reasonable choice for modern storage systems with hundreds of gigabytes of capacity. Likewise, modern file systems would not be a reasonable choice for these early systems, since the complexity of modern file system structures would quickly consume or even exceed the very limited capacity of the early storage systems.
Types of file systems
File system types can be classified into disk/tape file systems, network file systems and special-purpose file systems.
Disk file systems
A disk file system takes advantage of the ability of disk storage media to randomly address data in a short amount of time. Additional considerations include the speed of accessing data following that initially requested and the anticipation that the following data may also be requested. This permits multiple users (or processes) access to various data on the disk without regard to the sequential location of the data. Examples include FAT (FAT12, FAT16, FAT32), exFAT, NTFS, ReFS, HFS and HFS+, HPFS, APFS, UFS, ext2, ext3, ext4, XFS, btrfs, Files-11, Veritas File System, VMFS, ZFS, ReiserFS and ScoutFS. Some disk file systems are journaling file systems or versioning file systems.
Optical discs
ISO 9660 and Universal Disk Format (UDF) are two common formats that target Compact Discs, DVDs and Blu-ray discs. Mount Rainier is an extension to UDF, supported since the 2.6 series of the Linux kernel and since Windows Vista, that facilitates rewriting to DVDs.
Flash file systems
A flash file system considers the special abilities, performance and restrictions of flash memory devices. Frequently a disk file system can use a flash memory device as the underlying storage media, but it is much better to use a file system specifically designed for a flash device.[15]
Tape file systems
A tape file system is a file system and tape format designed to store files on tape. Magnetic tapes are sequential storage media with significantly longer random data access times than disks, posing challenges to the creation and efficient management of a general-purpose file system.
In a disk file system there is typically a master file directory, and a map of used and free data regions. Any file additions, changes, or removals require updating the directory and the used/free maps. Random access to data regions is measured in milliseconds so this system works well for disks.
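The directory-plus-map arrangement described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not any real on-disk format: the class name `ToyDisk`, its 4-byte block size and its method names are all hypothetical and chosen only for illustration.

```python
class ToyDisk:
    """Toy model of a directory plus a used/free block map (hypothetical layout)."""

    def __init__(self, num_blocks, block_size=4):
        self.block_size = block_size
        self.blocks = [b""] * num_blocks    # the data regions
        self.used = [False] * num_blocks    # the used/free map
        self.directory = {}                 # file name -> list of block numbers

    def write_file(self, name, data):
        """Add a file: find free blocks, store the data, update map and directory."""
        if name in self.directory:
            raise FileExistsError(name)
        chunks = [data[i:i + self.block_size]
                  for i in range(0, len(data), self.block_size)]
        free = [n for n, in_use in enumerate(self.used) if not in_use]
        if len(chunks) > len(free):
            raise OSError("no space left on toy disk")
        allocated = free[:len(chunks)]
        for block_no, chunk in zip(allocated, chunks):
            self.blocks[block_no] = chunk
            self.used[block_no] = True
        self.directory[name] = allocated

    def read_file(self, name):
        """Follow the directory entry and join the blocks in order."""
        return b"".join(self.blocks[n] for n in self.directory[name])

    def delete_file(self, name):
        """Remove the directory entry and mark its blocks free again."""
        for block_no in self.directory.pop(name):
            self.used[block_no] = False
```

Every addition or removal touches both the directory and the map, which is cheap only because any block can be reached quickly; a file written after a deletion may end up in non-adjacent blocks.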
Tape requires linear motion to wind and unwind potentially very long reels of media. This tape motion may take several seconds to several minutes to move the read/write head from one end of the tape to the other.
Consequently, a master file directory and usage map can be extremely slow and inefficient with tape. Writing typically involves reading the block usage map to find free blocks for writing, updating the usage map and directory to add the data, and then advancing the tape to write the data in the correct spot. Each additional file write requires updating the map and directory and writing the data, which may take several seconds to occur for each file.
Tape file systems instead typically allow for the file directory to be spread across the tape intermixed with the data, referred to as streaming, so that time-consuming and repeated tape motions are not required to write new data.
However, a side effect of this design is that reading the file directory of a tape usually requires scanning the entire tape to read all the scattered directory entries. Most data archiving software that works with tape storage will store a local copy of the tape catalog on a disk file system, so that adding files to a tape can be done quickly without having to rescan the tape media. The local tape catalog copy is usually discarded if not used for a specified period of time, at which point the tape must be re-scanned if it is to be used in the future.
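The trade-off between cheap appends and an expensive catalog scan can be illustrated with a minimal sketch. It assumes a hypothetical record layout (a 6-byte header holding the name length and data length, followed by the name and the data) and uses an in-memory byte stream to stand in for the tape; it does not reproduce any real tape format.

```python
import io
import struct

# Hypothetical per-file header: 2-byte name length, 4-byte data length.
HEADER = struct.Struct(">HI")

def append_file(tape, name, data):
    """Append a directory entry followed immediately by the file data."""
    encoded = name.encode("utf-8")
    tape.seek(0, io.SEEK_END)                  # always write at the end
    tape.write(HEADER.pack(len(encoded), len(data)))
    tape.write(encoded)
    tape.write(data)

def scan_catalog(tape):
    """Rebuild the catalog by walking every header from the start of the tape."""
    catalog = {}
    tape.seek(0)
    while True:
        raw = tape.read(HEADER.size)
        if len(raw) < HEADER.size:
            break                              # reached the end of recorded data
        name_len, data_len = HEADER.unpack(raw)
        name = tape.read(name_len).decode("utf-8")
        catalog[name] = (tape.tell(), data_len)   # (offset of data, length)
        tape.seek(data_len, io.SEEK_CUR)          # skip over the file data
    return catalog
```

Writing never moves backwards, but `scan_catalog` has to pass over the whole recording, which is why archiving software tends to keep the resulting catalog on disk.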
IBM has developed a file system for tape called the Linear Tape File System. The IBM implementation of this file system has been released as the open-source IBM Linear Tape File System — Single Drive Edition (LTFS-SDE) product. The Linear Tape File System uses a separate partition on the tape to record the index meta-data, thereby avoiding the problems associated with scattering directory entries across the entire tape.
Tape formatting
Writing data to a tape, erasing, or formatting a tape is often a significantly time-consuming process and can take several hours on large tapes.[a] With many data tape technologies it is not necessary to format the tape before over-writing new data to the tape. This is due to the inherently destructive nature of overwriting data on sequential media.
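The scale of "several hours" can be checked with simple arithmetic. The sketch below uses the figures from note [a] and assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a sustained transfer rate with no stops or repositioning; the function name is hypothetical.

```python
def full_pass_hours(capacity_bytes, bytes_per_second):
    """Time for one uninterrupted pass over the whole tape, in hours."""
    return capacity_bytes / bytes_per_second / 3600

# 2.5 TB at 160 MB/s, the figures given in note [a]
lto6_hours = full_pass_hours(2_500_000_000_000, 160_000_000)
print(f"{lto6_hours:.2f} hours")
```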
Because of the time it can take to format a tape, typically tapes are pre-formatted so that the tape user does not need to spend time preparing each new tape for use. All that is usually necessary is to write an identifying media label to the tape before use, and even this can be automatically written by software when a new tape is used for the first time.
Database file systems
Another concept for file management is the idea of a database-based file system. Instead of, or in addition to, hierarchical structured management, files are identified by their characteristics, like type of file, topic, author, or similar rich metadata.[16]
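As a rough illustration of identifying files by their characteristics rather than by a path, the following sketch stores file contents and free-form attributes in an in-memory SQLite database. The schema and the `store` and `find` helpers are hypothetical and only model the idea; they do not describe how any particular database file system is implemented.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, content BLOB)")
db.execute("CREATE TABLE attrs (file_id INTEGER, key TEXT, value TEXT)")

def store(content, **attributes):
    """Save a file together with arbitrary metadata; there is no path at all."""
    cur = db.execute("INSERT INTO files (content) VALUES (?)", (content,))
    db.executemany("INSERT INTO attrs VALUES (?, ?, ?)",
                   [(cur.lastrowid, k, v) for k, v in attributes.items()])
    return cur.lastrowid

def find(**criteria):
    """Return ids of files whose attributes match every given key/value pair."""
    ids = None
    for key, value in criteria.items():
        rows = db.execute(
            "SELECT file_id FROM attrs WHERE key = ? AND value = ?", (key, value))
        matching = {row[0] for row in rows}
        ids = matching if ids is None else ids & matching
    return sorted(ids or [])
```

A call such as `find(type="report", topic="storage")` then plays the role that a directory listing plays in a hierarchical file system.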
IBM DB2 for i [17] (formerly known as DB2/400 and DB2 for i5/OS) is a database file system that is part of the object-based IBM i[18] operating system (formerly known as OS/400 and i5/OS). It incorporates a single-level store and runs on IBM Power Systems (formerly known as AS/400 and iSeries). It was designed by Frank G. Soltis, IBM’s former chief scientist for IBM i. Between about 1978 and 1988, Soltis and his team at IBM Rochester successfully designed and applied technologies such as the database file system, which others, such as Microsoft, later failed to deliver.[19] These technologies are informally known as ‘Fortress Rochester’[citation needed] and in a few basic aspects were extended from early mainframe technologies, but were in many ways more advanced from a technological perspective[citation needed].
Some other projects that aren’t «pure» database file systems but that use some aspects of a database file system:
- Many Web content management systems use a relational DBMS to store and retrieve files. For example, XHTML files are stored as XML or text fields, while image files are stored as blob fields; SQL SELECT (with optional XPath) statements retrieve the files, and allow the use of sophisticated logic and richer information associations than «usual file systems.» Many CMSs also have the option of storing only metadata within the database, with the standard filesystem used to store the content of files.
- Very large file systems, embodied by applications like Apache Hadoop and Google File System, use some database file system concepts.
Transactional file systems
Some programs need to either make multiple file system changes, or, if one or more of the changes fail for any reason, make none of the changes. For example, a program which is installing or updating software may write executables, libraries, and/or configuration files. If some of the writing fails and the software is left partially installed or updated, the software may be broken or unusable. An incomplete update of a key system utility, such as the command shell, may leave the entire system in an unusable state.
Transaction processing introduces the atomicity guarantee, ensuring that operations inside of a transaction are either all committed or the transaction can be aborted and the system discards all of its partial results. This means that if there is a crash or power failure, after recovery, the stored state will be consistent. Either the software will be completely installed or the failed installation will be completely rolled back, but an unusable partial install will not be left on the system. Transactions also provide the isolation guarantee[clarification needed], meaning that operations within a transaction are hidden from other threads on the system until the transaction commits, and that interfering operations on the system will be properly serialized with the transaction.
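Without a transactional file system, programs sometimes approximate the all-or-nothing behaviour at the application level. The sketch below shows one such approximation, not a file system transaction: it stages every file in a temporary directory and then renames that directory into place. It assumes the target directory does not yet exist, that staging and target are on the same file system, and that the directory rename is atomic (as it is on POSIX systems); it provides no isolation from other processes. The function name is hypothetical.

```python
import os
import shutil
import tempfile

def install_all_or_nothing(target_dir, files):
    """Write every file in `files` (name -> bytes) under target_dir, or none of them."""
    parent = os.path.dirname(os.path.abspath(target_dir))
    staging = tempfile.mkdtemp(dir=parent)       # same file system as the target
    try:
        for name, data in files.items():
            with open(os.path.join(staging, name), "wb") as f:
                f.write(data)
                f.flush()
                os.fsync(f.fileno())             # push the data to stable storage
        os.rename(staging, target_dir)           # the single commit point
    except BaseException:
        shutil.rmtree(staging, ignore_errors=True)   # discard partial results
        raise
```

If any write fails, the staging directory is removed and the target never appears; if the rename succeeds, all files become visible together.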
Windows, beginning with Vista, added transaction support to NTFS, in a feature called Transactional NTFS, but its use is now discouraged.[20] There are a number of research prototypes of transactional file systems for UNIX systems, including the Valor file system,[21] Amino,[22] LFS,[23] and a transactional ext3 file system on the TxOS kernel,[24] as well as transactional file systems targeting embedded systems, such as TFFS.[25]
Ensuring consistency across multiple file system operations is difficult, if not impossible, without file system transactions. File locking can be used as a concurrency control mechanism for individual files, but it typically does not protect the directory structure or file metadata. For instance, file locking cannot prevent TOCTTOU race conditions on symbolic links.
File locking also cannot automatically roll back a failed operation, such as a software upgrade; this requires atomicity.
Journaling is one technique used to introduce transaction-level consistency to file system structures. Journal transactions are not exposed to programs as part of the OS API; they are only used internally to ensure consistency at the granularity of a single system call.
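The general idea behind a journal can be sketched as a small redo log. This is a simplified model under stated assumptions, not the design of any particular journaling file system: the `ToyJournal` class, its record format and the example keys are hypothetical, and in-memory Python objects stand in for on-disk structures.

```python
class ToyJournal:
    """Toy redo log: updates are logged first and applied only if committed."""

    def __init__(self):
        self.journal = []   # stands in for the on-disk journal area
        self.state = {}     # stands in for the main file system structures

    def log(self, txid, updates, commit=True):
        """Record the intended updates; a missing commit record models a crash."""
        self.journal.append(("begin", txid, dict(updates)))
        if commit:
            self.journal.append(("commit", txid))

    def recover(self):
        """Replay committed transactions and ignore incomplete ones."""
        committed = {rec[1] for rec in self.journal if rec[0] == "commit"}
        for rec in self.journal:
            if rec[0] == "begin" and rec[1] in committed:
                self.state.update(rec[2])    # replaying twice gives the same result
        self.journal.clear()
```

After a simulated crash, `recover` leaves the main structures reflecting either all of a transaction's updates or none of them.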
Data backup systems typically do not provide support for direct backup of data stored in a transactional manner, which makes the recovery of reliable and consistent data sets difficult. Most backup software simply notes what files have changed since a certain time, regardless of the transactional state shared across multiple files in the overall dataset. As a workaround, some database systems simply produce an archived state file containing all data up to that point, and the backup software only backs that up and does not interact directly with the active transactional databases at all. Recovery requires separate recreation of the database from the state file after the file has been restored by the backup software.
Network file systems
A network file system is a file system that acts as a client for a remote file access protocol, providing access to files on a server. Programs using local interfaces can transparently create, manage and access hierarchical directories and files in remote network-connected computers. Examples of network file systems include clients for the NFS, AFS, SMB protocols, and file-system-like clients for FTP and WebDAV.
Shared disk file systems
A shared disk file system is one in which a number of machines (usually servers) all have access to the same external disk subsystem (usually a storage area network). The file system arbitrates access to that subsystem, preventing write collisions.[26] Examples include GFS2 from Red Hat, GPFS, now known as Spectrum Scale, from IBM, SFS from DataPlow, CXFS from SGI, StorNext from Quantum Corporation and ScoutFS from Versity.
Special file systems
A special file system presents non-file elements of an operating system as files so they can be acted on using file system APIs. This is most commonly done in Unix-like operating systems, but devices are given file names in some non-Unix-like operating systems as well.
Device file systems
A device file system represents I/O devices and pseudo-devices as files, called device files. Examples in Unix-like systems include devfs and, in Linux 2.6 systems, udev. In non-Unix-like systems, such as TOPS-10 and other operating systems influenced by it, where the full filename or pathname of a file can include a device prefix, devices other than those containing file systems are referred to by a device prefix specifying the device, without anything following it.
Other special file systems
- In the Linux kernel, configfs and sysfs provide files that can be used to query the kernel for information and configure entities in the kernel.
- procfs maps processes and, on Linux, other operating system structures into a filespace.
Minimal file system / audio-cassette storage
In the 1970s disk and digital tape devices were too expensive for some early microcomputer users. An inexpensive basic data storage system was devised that used common audio cassette tape.
When the system needed to write data, the user was notified to press «RECORD» on the cassette recorder, then press «RETURN» on the keyboard to notify the system that the cassette recorder was recording. The system wrote a sound to provide time synchronization, then modulated sounds that encoded a prefix, the data, a checksum and a suffix. When the system needed to read data, the user was instructed to press «PLAY» on the cassette recorder. The system would listen to the sounds on the tape waiting until a burst of sound could be recognized as the synchronization. The system would then interpret subsequent sounds as data. When the data read was complete, the system would notify the user to press «STOP» on the cassette recorder. It was primitive, but it (mostly) worked. Data was stored sequentially, usually in an unnamed format, although some systems (such as the Commodore PET series of computers) did allow the files to be named. Multiple sets of data could be written and located by fast-forwarding the tape and observing the tape counter to find the approximate start of the next data region on the tape. The user might have to listen to the sounds to find the right spot to begin playing the next data region. Some implementations even included audible sounds interspersed with the data.
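The record structure described above (synchronization, prefix, data, checksum, suffix) can be modelled at the byte level. The sketch below is illustrative only: the marker values, the one-byte length field and the additive checksum are assumptions made for the example and do not correspond to the format of any specific machine, and the audio modulation itself is left out.

```python
SYNC = b"\x55" * 8     # hypothetical leader standing in for the synchronization sound
PREFIX = b"\x02"       # hypothetical start-of-record marker
SUFFIX = b"\x03"       # hypothetical end-of-record marker

def checksum(data):
    """Simple additive checksum (an assumption; real systems varied)."""
    return sum(data) % 256

def encode_record(data):
    """Frame the data as sync + prefix + length + data + checksum + suffix."""
    if len(data) > 255:
        raise ValueError("this toy format stores the length in one byte")
    return (SYNC + PREFIX + bytes([len(data)]) + data
            + bytes([checksum(data)]) + SUFFIX)

def decode_record(stream):
    """Wait for the synchronization pattern, then read and verify one record."""
    start = stream.find(SYNC + PREFIX)
    if start < 0:
        raise ValueError("no synchronization found")
    body = stream[start + len(SYNC) + len(PREFIX):]
    if not body:
        raise ValueError("truncated record")
    length = body[0]
    data = body[1:1 + length]
    trailer = body[1 + length:3 + length]      # checksum byte followed by suffix
    if len(data) != length or trailer != bytes([checksum(data)]) + SUFFIX:
        raise ValueError("corrupt or truncated record")
    return data
```

The checksum lets the reader detect, but not repair, a damaged recording, which fits the "(mostly) worked" character of these systems.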
Flat file systems
In a flat file system, there are no subdirectories; directory entries for all files are stored in a single directory.
When floppy disk media was first available this type of file system was adequate due to the relatively small amount of data space available. CP/M machines featured a flat file system, where files could be assigned to one of 16 user areas and generic file operations narrowed to work on one instead of defaulting to work on all of them. These user areas were no more than special attributes associated with the files; that is, it was not necessary to define specific quota for each of these areas and files could be added to groups for as long as there was still free storage space on the disk. The early Apple Macintosh also featured a flat file system, the Macintosh File System (MFS). It was unusual in that the file management program (Macintosh Finder) created the illusion of a partially hierarchical filing system on top of MFS. This structure required every file to have a unique name, even if it appeared to be in a separate folder. IBM DOS/360 and OS/360 store entries for all files on a disk pack (volume) in a directory on the pack called a Volume Table of Contents (VTOC).
While simple, flat file systems become awkward as the number of files grows, and they make it difficult to organize data into related groups of files.
A recent addition to the flat file system family is Amazon’s S3, a remote storage service, which is intentionally simplistic to allow users the ability to customize how their data is stored. The only constructs are buckets (imagine a disk drive of unlimited size) and objects (similar, but not identical to the standard concept of a file). Advanced file management is allowed by being able to use nearly any character (including ‘/’) in the object’s name, and the ability to select subsets of the bucket’s content based on identical prefixes.
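How prefixes give a flat namespace the appearance of folders can be shown with a short sketch. It models the idea in plain Python on a set of key strings and does not use the actual S3 API; the function name `list_prefix` and the example keys are hypothetical.

```python
def list_prefix(keys, prefix="", delimiter="/"):
    """Group flat keys under `prefix` into direct 'files' and apparent 'folders'."""
    files, folders = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            folders.add(prefix + head + delimiter)   # deeper keys collapse to a prefix
        else:
            files.append(key)                        # key sits directly under prefix
    return files, sorted(folders)

bucket = {
    "readme.txt",
    "photos/index.html",
    "photos/2020/a.jpg",
    "photos/2021/b.jpg",
}
top_level = list_prefix(bucket)
inside_photos = list_prefix(bucket, "photos/")
```

No directory objects exist anywhere in `bucket`; the hierarchy is derived entirely from the characters in the object names.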
File systems and operating systems
Many operating systems include support for more than one file system. Sometimes the OS and the file system are so tightly interwoven that it is difficult to separate out file system functions.
There needs to be an interface provided by the operating system software between the user and the file system. This interface can be textual (such as provided by a command line interface, such as the Unix shell, or OpenVMS DCL) or graphical (such as provided by a graphical user interface, such as file browsers). If graphical, the metaphor of the folder, containing documents, other files, and nested folders is often used (see also: directory and folder).
Unix and Unix-like operating systems
Unix-like operating systems create a virtual file system, which makes all the files on all the devices appear to exist in a single hierarchy. This means, in those systems, there is one root directory, and every file existing on the system is located under it somewhere. Unix-like systems can use a RAM disk or network shared resource as their root directory.
Unix-like systems assign a device name to each device, but this is not how the files on that device are accessed. Instead, to gain access to files on another device, the operating system must first be informed where in the directory tree those files should appear. This process is called mounting a file system. For example, to access the files on a CD-ROM, one must tell the operating system «Take the file system from this CD-ROM and make it appear under such-and-such directory.» The directory given to the operating system is called the mount point – it might, for example, be /media. The /media directory exists on many Unix systems (as specified in the Filesystem Hierarchy Standard) and is intended specifically for use as a mount point for removable media such as CDs, DVDs, USB drives or floppy disks. It may be empty, or it may contain subdirectories for mounting individual devices. Generally, only the administrator (i.e. root user) may authorize the mounting of file systems.
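The effect of mounting on path lookup can be sketched as a longest-prefix match against a table of mount points. This is a simplified model, not how any kernel implements it; the device names and the `resolve` helper are hypothetical.

```python
import posixpath

def resolve(mount_table, path):
    """Return (device, path inside that device's file system) for an absolute path."""
    path = posixpath.normpath(path)
    best = None
    for mount_point in mount_table:
        inside = (path == mount_point
                  or path.startswith(mount_point.rstrip("/") + "/"))
        if inside and (best is None or len(mount_point) > len(best)):
            best = mount_point                 # prefer the deepest matching mount point
    if best is None:
        raise LookupError("no file system is mounted at /")
    relative = posixpath.relpath(path, best)
    return mount_table[best], "/" if relative == "." else "/" + relative

# Hypothetical mount table: mount point -> device
mounts = {"/": "/dev/sda1", "/home": "/dev/sda2", "/media/cdrom": "/dev/sr0"}
```

With this table, a path under /media/cdrom is served by the CD-ROM's file system even though the path itself never mentions the device name.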
Unix-like operating systems often include software and tools that assist in the mounting process and provide it new functionality. Some of these strategies have been coined «auto-mounting» as a reflection of their purpose.
- In many situations, file systems other than the root need to be available as soon as the operating system has booted. All Unix-like systems therefore provide a facility for mounting file systems at boot time. System administrators define these file systems in the configuration file fstab (vfstab in Solaris), which also indicates options and mount points.
- In some situations, there is no need to mount certain file systems at boot time, although their use may be desired thereafter. There are some utilities for Unix-like systems that allow the mounting of predefined file systems upon demand.
- Removable media allow programs and data to be transferred between machines without a physical connection. Common examples include USB flash drives, CD-ROMs, and DVDs. Utilities have therefore been developed to detect the presence and availability of a medium and then mount that medium without any user intervention.
- Progressive Unix-like systems have also introduced a concept called supermounting; see, for example, the Linux supermount-ng project. For example, a floppy disk that has been supermounted can be physically removed from the system. Under normal circumstances, the disk should have been synchronized and then unmounted before its removal. Provided synchronization has occurred, a different disk can be inserted into the drive. The system automatically notices that the disk has changed and updates the mount point contents to reflect the new medium.
- An automounter will automatically mount a file system when a reference is made to the directory atop which it should be mounted. This is usually used for file systems on network servers, rather than relying on events such as the insertion of media, as would be appropriate for removable media.
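The boot-time configuration mentioned in the list above can be illustrated by parsing a small fstab-style table. The sketch assumes the common Linux layout of six whitespace-separated fields (device, mount point, type, options, dump, pass); Solaris vfstab uses a different layout and is not handled. The sample entries and helper names are hypothetical.

```python
from collections import namedtuple

FstabEntry = namedtuple(
    "FstabEntry", "device mount_point fs_type options dump fsck_pass")

def parse_fstab(text):
    """Parse fstab-style text, skipping blank lines and # comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            raise ValueError("expected at least four fields: " + line)
        dump = int(fields[4]) if len(fields) > 4 else 0
        fsck_pass = int(fields[5]) if len(fields) > 5 else 0
        entries.append(FstabEntry(fields[0], fields[1], fields[2],
                                  fields[3].split(","), dump, fsck_pass))
    return entries

def boot_time_mounts(entries):
    """Mount points that would be mounted at boot (entries without 'noauto')."""
    return [e.mount_point for e in entries
            if "noauto" not in e.options and e.fs_type != "swap"]

SAMPLE = """
# device     mount point    type     options            dump pass
/dev/sda1    /              ext4     defaults           0    1
/dev/sda2    /home          ext4     defaults,noatime   0    2
/dev/sr0     /media/cdrom   iso9660  ro,noauto          0    0
"""
entries = parse_fstab(SAMPLE)
```

The `noauto` entry corresponds to the second case in the list: a predefined file system that is mounted only on demand.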
Linux
Linux supports numerous file systems, but common choices for the system disk on a block device include the ext* family (ext2, ext3 and ext4), XFS, JFS, and btrfs. For raw flash without a flash translation layer (FTL) or Memory Technology Device (MTD), there are UBIFS, JFFS2 and YAFFS, among others. SquashFS is a common compressed read-only file system.
Solaris
Solaris in earlier releases defaulted to (non-journaled or non-logging) UFS for bootable and supplementary file systems. Solaris defaulted to, supported, and extended UFS.
Support for other file systems and significant enhancements were added over time, including Veritas Software Corp. (journaling) VxFS, Sun Microsystems (clustering) QFS, Sun Microsystems (journaling) UFS, and Sun Microsystems (open source, poolable, 128-bit, compressible, and error-correcting) ZFS.
Kernel extensions were added to Solaris to allow for bootable Veritas VxFS operation. Logging or journaling was added to UFS in Sun’s Solaris 7. Releases of Solaris 10, Solaris Express, OpenSolaris, and other open source variants of the Solaris operating system later supported bootable ZFS.
Logical Volume Management allows for spanning a file system across multiple devices for the purpose of adding redundancy, capacity, and/or throughput. Legacy environments in Solaris may use Solaris Volume Manager (formerly known as Solstice DiskSuite). Multiple operating systems (including Solaris) may use Veritas Volume Manager. Modern Solaris-based operating systems remove the need for separate volume management by using virtual storage pools in ZFS.
macOS
macOS (formerly Mac OS X) uses the Apple File System (APFS), which in 2017 replaced a file system inherited from classic Mac OS called HFS Plus (HFS+). Apple also uses the term «Mac OS Extended» for HFS+.[27] HFS Plus is a metadata-rich and case-preserving but (usually) case-insensitive file system. Due to the Unix roots of macOS, Unix permissions were added to HFS Plus. Later versions of HFS Plus added journaling to prevent corruption of the file system structure and introduced a number of optimizations to the allocation algorithms in an attempt to defragment files automatically without requiring an external defragmenter.
Filenames can be up to 255 characters. HFS Plus uses Unicode to store filenames. On macOS, the filetype can come from the type code, stored in file’s metadata, or the filename extension.
HFS Plus has three kinds of links: Unix-style hard links, Unix-style symbolic links, and aliases. Aliases are designed to maintain a link to their original file even if they are moved or renamed; they are not interpreted by the file system itself, but by the File Manager code in userland.
macOS 10.13 High Sierra, which was announced on June 5, 2017 at Apple’s WWDC event, uses the Apple File System on solid-state drives.
macOS also supported the UFS file system, derived from the BSD Unix Fast File System via NeXTSTEP. However, as of Mac OS X Leopard, macOS could no longer be installed on a UFS volume, nor could a pre-Leopard system installed on a UFS volume be upgraded to Leopard.[28] As of Mac OS X Lion, UFS support was completely dropped.
Newer versions of macOS are capable of reading and writing to the legacy FAT file systems (16 and 32) common on Windows. They are also capable of reading the newer NTFS file systems for Windows. In order to write to NTFS file systems on macOS versions prior to Mac OS X Snow Leopard third party software is necessary. Mac OS X 10.6 (Snow Leopard) and later allow writing to NTFS file systems, but only after a non-trivial system setting change (third party software exists that automates this).[29]
Finally, macOS supports reading and writing of the exFAT file system since Mac OS X Snow Leopard, starting from version 10.6.5.[30]
OS/2
OS/2 1.2 introduced the High Performance File System (HPFS). HPFS supports mixed case file names in different code pages, long file names (255 characters), more efficient use of disk space, an architecture that keeps related items close to each other on the disk volume, less fragmentation of data, extent-based space allocation, a B+ tree structure for directories, and the root directory located at the midpoint of the disk, for faster average access. A journaled filesystem (JFS) was shipped in 1999.
PC-BSD
PC-BSD is a desktop version of FreeBSD, which inherits FreeBSD’s ZFS support, similarly to FreeNAS. The new graphical installer of PC-BSD can handle / (root) on ZFS and RAID-Z pool installs and disk encryption using Geli right from the start in an easy convenient (GUI) way. The current PC-BSD 9.0+ ‘Isotope Edition’ has ZFS filesystem version 5 and ZFS storage pool version 28.
Plan 9
Plan 9 from Bell Labs treats everything as a file and accesses all objects as a file would be accessed (i.e., there is no ioctl or mmap): networking, graphics, debugging, authentication, capabilities, encryption, and other services are accessed via I/O operations on file descriptors. The 9P protocol removes the difference between local and remote files. File systems in Plan 9 are organized with the help of private, per-process namespaces, allowing each process to have a different view of the many file systems that provide resources in a distributed system.
The Inferno operating system shares these concepts with Plan 9.
Microsoft Windows
[Figure: directory listing in a Windows command shell]
Windows makes use of the FAT, NTFS, exFAT, Live File System and ReFS file systems (the last of these is only supported and usable in Windows Server 2012, Windows Server 2016, Windows 8, Windows 8.1, and Windows 10; Windows cannot boot from it).
Windows uses a drive letter abstraction at the user level to distinguish one disk or partition from another. For example, the path C:\WINDOWS represents a directory WINDOWS on the partition represented by the letter C. Drive C: is most commonly used for the primary hard disk drive partition, on which Windows is usually installed and from which it boots. This «tradition» has become so firmly ingrained that bugs exist in many applications that assume the drive on which the operating system is installed is C. The use of drive letters, and the tradition of using «C» as the drive letter for the primary hard disk drive partition, can be traced to MS-DOS, where the letters A and B were reserved for up to two floppy disk drives. This in turn derived from CP/M in the 1970s, and ultimately from IBM’s CP/CMS of 1967.
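The way a drive letter combines with the rest of a path can be inspected with Python's standard `pathlib` module, which parses Windows-style paths on any platform. The `describe` helper below is a hypothetical wrapper written for this illustration.

```python
from pathlib import PureWindowsPath

def describe(path_text):
    """Split a Windows-style path into its drive and remaining components."""
    path = PureWindowsPath(path_text)
    return {
        "drive": path.drive,                # for example "C:"
        "absolute": path.is_absolute(),     # requires both a drive and a root
        "components": [p for p in path.parts if p != path.anchor],
    }

info = describe(r"C:\WINDOWS\system32")
```

A path such as `C:WINDOWS`, with a drive letter but no backslash after the colon, is reported as not absolute: it is interpreted relative to the current directory on drive C.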
FAT
The family of FAT file systems is supported by almost all operating systems for personal computers, including all versions of Windows and MS-DOS/PC DOS, OS/2, and DR-DOS. (PC DOS is an OEM version of MS-DOS; MS-DOS was originally based on SCP’s 86-DOS. DR-DOS was based on Digital Research’s Concurrent DOS, a successor of CP/M-86.) The FAT file systems are therefore well-suited as a universal exchange format between computers and devices of almost any type and age.
The FAT file system traces its roots back to an (incompatible) 8-bit FAT precursor in Standalone Disk BASIC and the short-lived MDOS/MIDAS project.[citation needed]
Over the years, the file system has been expanded from FAT12 to FAT16 and FAT32. Various features have been added to the file system including subdirectories, codepage support, extended attributes, and long filenames. Third parties such as Digital Research have incorporated optional support for deletion tracking, and volume/directory/file-based multi-user security schemes to support file and directory passwords and permissions such as read/write/execute/delete access rights. Most of these extensions are not supported by Windows.
The FAT12 and FAT16 file systems had a limit on the number of entries in the root directory of the file system and had restrictions on the maximum size of FAT-formatted disks or partitions.
FAT32 addresses the limitations in FAT12 and FAT16, except for the file size limit of close to 4 GB, but it remains limited compared to NTFS.
FAT12, FAT16 and FAT32 also have a limit of eight characters for the file name, and three characters for the extension (such as .exe). This is commonly referred to as the 8.3 filename limit. VFAT, an optional extension to FAT12, FAT16 and FAT32, introduced in Windows 95 and Windows NT 3.5, allowed long file names (LFN) to be stored in the FAT file system in a backwards compatible fashion.
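Both limits mentioned above can be expressed as simple checks. The sketch below assumes that the FAT32 file size limit follows from a 32-bit size field, and it validates 8.3 names against a deliberately conservative character set (letters, digits and a few punctuation characters); the exact set of characters accepted varies between implementations, so the pattern should be read as an approximation. The function names are hypothetical.

```python
import re

# Largest value of a 32-bit size field: one byte short of 4 GiB.
FAT32_MAX_FILE_SIZE = 2**32 - 1

# Conservative subset of characters for short names (an assumption).
_CHAR = r"[A-Z0-9!#$%&'()\-@^_`{}~]"
_SHORT_NAME = re.compile(_CHAR + r"{1,8}(\." + _CHAR + r"{1,3})?")

def fits_8_3(name):
    """True if the name fits the 8.3 pattern without needing a long file name."""
    return _SHORT_NAME.fullmatch(name.upper()) is not None

def fits_fat32(size_in_bytes):
    """True if a file of this size can be stored on FAT32."""
    return 0 <= size_in_bytes <= FAT32_MAX_FILE_SIZE
```

Names that fail `fits_8_3`, such as `index.html` with its four-character extension, are the ones that depend on the VFAT long file name extension.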
NTFS
Main article: NTFS
NTFS, introduced with the Windows NT operating system in 1993, allowed ACL-based permission control. Other features also supported by NTFS include hard links, multiple file streams, attribute indexing, quota tracking, sparse files, encryption, compression, and reparse points (directories working as mount-points for other file systems, symlinks, junctions, remote storage links).
exFAT
exFAT has certain advantages over NTFS with regard to file system overhead.[citation needed]
exFAT is not backward compatible with FAT file systems such as FAT12, FAT16 or FAT32. The file system is supported in newer Windows systems, such as Windows XP, Windows Server 2003, Windows Vista, Windows Server 2008, Windows 7, Windows 8, Windows 8.1, Windows 10 and Windows 11.
exFAT is supported in macOS starting with version 10.6.5 (Snow Leopard).[30] Support in other operating systems is sparse since implementing support for exFAT requires a license. exFAT is the only file system that is fully supported on both macOS and Windows that can hold files larger than 4 GB.[31][32]
OpenVMS
MVS
Prior to the introduction of VSAM, OS/360 systems implemented a hybrid file system. The system was designed to easily support removable disk packs, so the information relating to all files on one disk (volume in IBM terminology) is stored on that disk in a flat system file called the Volume Table of Contents (VTOC). The VTOC stores all metadata for the file. Later a hierarchical directory structure was imposed with the introduction of the System Catalog, which can optionally catalog files (datasets) on resident and removable volumes. The catalog only contains information to relate a dataset to a specific volume. If the user requests access to a dataset on an offline volume, and they have suitable privileges, the system will attempt to mount the required volume. Cataloged and non-cataloged datasets can still be accessed using information in the VTOC, bypassing the catalog, if the required volume id is provided to the OPEN request. Still later the VTOC was indexed to speed up access.
Conversational Monitor System
The IBM Conversational Monitor System (CMS) component of VM/370 uses a separate flat file system for each virtual disk (minidisk). File data and control information are scattered and intermixed. The anchor is a record called the Master File Directory (MFD), always located in the fourth block on the disk. Originally CMS used fixed-length 800-byte blocks, but later versions used larger size blocks up to 4K. Access to a data record requires two levels of indirection, where the file’s directory entry (called a File Status Table (FST) entry) points to blocks containing a list of addresses of the individual records.
AS/400 file system
Data on the AS/400 and its successors consists of system objects mapped into the system virtual address space in a single-level store. Many types of objects are defined including the directories and files found in other file systems. File objects, along with other types of objects, form the basis of the AS/400’s support for an integrated relational database.
Other file systems
- The Prospero File System is a file system based on the Virtual System Model.[33] The system was created by Dr. B. Clifford Neuman of the Information Sciences Institute at the University of Southern California.
- RSRE FLEX file system — written in ALGOL 68
- The file system of the Michigan Terminal System (MTS) is interesting because: (i) it provides «line files» where record lengths and line numbers are associated as metadata with each record in the file, lines can be added, replaced, updated with the same or different length records, and deleted anywhere in the file without the need to read and rewrite the entire file; (ii) using program keys files may be shared or permitted to commands and programs in addition to users and groups; and (iii) there is a comprehensive file locking mechanism that protects both the file’s data and its metadata.[34][35]
Limitations
Converting the type of a file system
It may be advantageous or necessary to have files in a different file system than the one in which they currently exist. Reasons include a need for more space than the limits of the current file system allow, or for deeper paths than it permits. There may also be performance or reliability considerations. Providing access to another operating system that does not support the existing file system is another reason.
In-place conversion
In some cases conversion can be done in-place, although migrating the file system is more conservative, as it involves creating a copy of the data, and is recommended.[36] On Windows, FAT and FAT32 file systems can be converted to NTFS via the convert.exe utility, but not the reverse.[36] On Linux, ext2 can be converted to ext3 (and converted back), and ext3 can be converted to ext4 (but not back),[37] and both ext3 and ext4 can be converted to btrfs, and converted back until the undo information is deleted.[38] These conversions are possible due to using the same format for the file data itself, and relocating the metadata into empty space, in some cases using sparse file support.[38]
Migrating to a different file system
Migration has the disadvantage of requiring additional space although it may be faster. The best case is if there is unused space on media which will contain the final file system.
For example, to migrate a FAT32 file system to an ext2 file system, first create a new ext2 file system, then copy the data to it, then delete the FAT32 file system.
An alternative, when there is not sufficient space to retain the original file system until the new one is created, is to use a work area (such as removable media). This takes longer, but a backup of the data is a useful side effect.
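The copy step of a migration can be sketched as follows, assuming both file systems are already created and mounted and that `source_root` and `dest_root` are directories on the old and new file systems respectively. The helper names are hypothetical, and the sketch compares file contents only; permissions, ownership and other metadata that the destination may not support are not checked.

```python
import filecmp
import os
import shutil

def verify_copy(source_root, dest_root):
    """Return relative paths of files that are missing or differ in the copy."""
    problems = []
    for dirpath, _dirs, files in os.walk(source_root):
        rel_dir = os.path.relpath(dirpath, source_root)
        for name in files:
            rel = os.path.normpath(os.path.join(rel_dir, name))
            copy = os.path.join(dest_root, rel)
            same = (os.path.isfile(copy) and
                    filecmp.cmp(os.path.join(dirpath, name), copy, shallow=False))
            if not same:
                problems.append(rel)
    return sorted(problems)

def migrate(source_root, dest_root):
    """Copy everything to the new file system, then verify before deleting anything."""
    shutil.copytree(source_root, dest_root, dirs_exist_ok=True)
    return verify_copy(source_root, dest_root)
```

Only when `migrate` returns an empty list would it be reasonable to delete the original file system.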
Long file paths and long file names
In hierarchical file systems, files are accessed by means of a path that is a branching list of directories containing the file. Different file systems have different limits on the depth of the path. File systems also have a limit on the length of an individual filename.
Copying files with long names or located in paths of significant depth from one file system to another may cause undesirable results. This depends on how the utility doing the copying handles the discrepancy.
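As a rough illustration of how a copying utility could detect the discrepancy before it causes trouble, here is a minimal Python sketch. The function names are hypothetical, and it assumes a POSIX system where `os.pathconf` can report the destination's filename limit through `PC_NAME_MAX`; it compares encoded byte lengths, since limits are usually expressed in bytes rather than characters.

```python
import os


def names_exceeding_limit(names, name_max):
    """Return the names whose encoded length is larger than name_max bytes."""
    return [name for name in names if len(os.fsencode(name)) > name_max]


def name_limit_for(directory="."):
    """Ask the OS for the longest filename allowed in `directory` (POSIX only)."""
    return os.pathconf(directory, "PC_NAME_MAX")


if __name__ == "__main__":
    # Check two candidate names against the limit of the current directory.
    limit = name_limit_for(".")
    print("limit:", limit)
    print("too long:", [len(n) for n in names_exceeding_limit(["report.txt", "x" * 300], limit)])
```

A real utility would then decide what to do with the offending names (skip, truncate, or abort), which is exactly the behaviour that differs between tools.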
See also
- Comparison of file systems
- Disk quota
- List of file systems
- List of Unix commands
- Directory structure
- Shared resource
- Distributed file system
- Distributed Data Management Architecture
- File manager
- File system fragmentation
- Filename extension
- Global file system
- Object storage
- Computer data storage
- Storage efficiency
- Virtual file system
Notes
- ^ An LTO-6 2.5 TB tape requires more than 4 hours to write at 160 MB/s
References
- ^ «5.10. Filesystems». The Linux Documentation Project. Retrieved December 11, 2021.
A filesystem is the methods and data structures that an operating system uses to keep track of files on a disk or partition; that is, the way the files are organized on the disk.
- ^ «Storage, IT Technology and Markets, Status and Evolution» (PDF). September 20, 2018.
HDD still key storage for the foreseeable future, SSDs not cost effective for capacity
- ^ Arpaci-Dusseau, Remzi H.; Arpaci-Dusseau, Andrea C. (2014), File System Implementation (PDF), Arpaci-Dusseau Books
- ^ Arpaci-Dusseau, Remzi H.; Arpaci-Dusseau, Andrea C. (2014), Sun’s Network File System (PDF), Arpaci-Dusseau Books
- ^ McGill, Florence E. (1922). Office Practice and Business Procedure. Gregg Publishing Company. p. 197. Retrieved August 1, 2016.
- ^ Waring, R.L. (1961). Technical investigations of addition of a hardcopy output to the elements of a mechanized library system : final report, 20 Sept. 1961. Cincinnati, OH: Svco Corporation. OCLC 310795767.
- ^ Disc File Applications: Reports Presented at the Nation’s First Disc File Symposium. American Data Processing. 1964. Retrieved August 1, 2016.
- ^ a b c Amir, Yair. «Operating Systems 600.418 The File System». Department of Computer Science Johns Hopkins University. Retrieved July 31, 2016.
- ^ a b IBM Corporation. «Component Structure of the Logical File System». IBM Knowledge Center. Retrieved July 31, 2016.
- ^ Carrier 2005, pp. 187–188.
- ^ Valvano, Jonathan W. (2011). Embedded Microcomputer Systems: Real Time Interfacing (Third ed.). Cengage Learning. p. 524. ISBN 978-1-111-42625-5. Retrieved June 30, 2022.
- ^ R. C. Daley; P. G. Neumann (1965). «A General-Purpose File System For Secondary Storage». Proceedings of the November 30—December 1, 1965, fall joint computer conference, Part I on XX — AFIPS ’65 (Fall, part I). Fall Joint Computer Conference. AFIPS. pp. 213–229. doi:10.1145/1463891.1463915. Retrieved 2011-07-30.
- ^ Mohan, I. Chandra (2013). Operating Systems. Delhi: PHI Learning Pvt. Ltd. p. 166. ISBN 9788120347267. Retrieved 2014-07-27.
The word dentry is short for ‘directory entry’. A dentry is nothing but a specific component in the path from the root. They (directory name or file name) provide for accessing files or directories[.]
- ^ «KSAM: A B + -tree-based keyed sequential-access method». ResearchGate. Retrieved 29 April 2016.
- ^ Douglis, Fred; Cáceres, Ramón; Kaashoek, M. Frans; Krishnan, P.; Li, Kai; Marsh, Brian; Tauber, Joshua (1994). «18. Storage Alternatives for Mobile Computers». Mobile Computing. Vol. 353. USENIX. pp. 473–505. doi:10.1007/978-0-585-29603-6_18. ISBN 978-0-585-29603-6. S2CID 2441760.
- ^ «Windows on a database – sliced and diced by BeOS vets». theregister.co.uk. 2002-03-29. Retrieved 2014-02-07.
- ^ «IBM DB2 for i: Overview». 03.ibm.com. Archived from the original on 2013-08-02. Retrieved 2014-02-07.
- ^ «IBM developerWorks : New to IBM i». Ibm.com. 2011-03-08. Retrieved 2014-02-07.
- ^ «XP successor Longhorn goes SQL, P2P – Microsoft leaks». theregister.co.uk. 2002-01-28. Retrieved 2014-02-07.
- ^ «Alternatives to using Transactional NTFS (Windows)». Msdn.microsoft.com. 2013-12-05. Retrieved 2014-02-07.
- ^ Spillane, Richard; Gaikwad, Sachin; Chinni, Manjunath; Zadok, Erez; Wright, Charles P. (2009). Enabling transactional file access via lightweight kernel extensions (PDF). Seventh USENIX Conference on File and Storage Technologies (FAST 2009).
- ^ Wright, Charles P.; Spillane, Richard; Sivathanu, Gopalan; Zadok, Erez (2007). «Extending ACID Semantics to the File System» (PDF). ACM Transactions on Storage. 3 (2): 4. doi:10.1145/1242520.1242521. S2CID 8939577.
- ^ Seltzer, Margo I. (1993). «Transaction Support in a Log-Structured File System» (PDF). Proceedings of the Ninth International Conference on Data Engineering.
- ^ Porter, Donald E.; Hofmann, Owen S.; Rossbach, Christopher J.; Benn, Alexander; Witchel, Emmett (October 2009). «Operating System Transactions» (PDF). Proceedings of the 22nd ACM Symposium on Operating Systems Principles (SOSP ’09). Big Sky, MT.
- ^ Gal, Eran; Toledo, Sivan. A Transactional Flash File System for Microcontrollers (PDF). USENIX 2005.
- ^ Troppens, Ulf; Erkens, Rainer; Müller, Wolfgang (2004). Storage Networks Explained: Basics and Application of Fibre Channel SAN, NAS, iSCSI and InfiniBand. John Wiley & Sons. pp. 124–125. ISBN 0-470-86182-7. Retrieved June 30, 2022.
- ^ «Mac OS X: About file system journaling». Apple. Retrieved 8 February 2014.
- ^ «Mac OS X 10.5 Leopard: Installing on a UFS-formatted volume». apple.com. 19 October 2007. Archived from the original on 16 March 2008. Retrieved 29 April 2016.
- ^ OSXDaily (2013-10-02). «How to Enable NTFS Write Support in Mac OS X». Retrieved 6 February 2014.
- ^ a b Steve Bunting (2012-08-14). EnCase Computer Forensics — The Official EnCE: EnCase Certified Examiner. ISBN 9781118219409. Retrieved 2014-02-07.
- ^ «File system formats available in Disk Utility on Mac». Apple Support.
- ^ «exFAT file system specification». Microsoft Docs.
- ^ The Prospero File System: A Global File System Based on the Virtual System Model. 1992.
- ^ Pirkola, G. C. (June 1975). «A file system for a general-purpose time-sharing environment». Proceedings of the IEEE. 63 (6): 918–924. doi:10.1109/PROC.1975.9856. ISSN 0018-9219. S2CID 12982770.
- ^ Pirkola, Gary C.; Sanguinetti, John. «The Protection of Information in a General Purpose Time-Sharing Environment». Proceedings of the IEEE Symposium on Trends and Applications 1977: Computer Security and Integrity. Vol. 10. pp. 106–114.
- ^ a b «How to Convert FAT Disks to NTFS». Microsoft Docs.
- ^ «Ext4 Howto». kernel.org. Retrieved 29 April 2016.
- ^ a b «Conversion from Ext3». Btrfs wiki.
Sources
- de Boyne Pollard, Jonathan (1996). «Disc and volume size limits». Frequently Given Answers. Retrieved February 9, 2005.
- «OS/2 corrective service fix JR09427». IBM. Retrieved February 9, 2005.
- «Attribute — $EA_INFORMATION (0xD0)». NTFS Information, Linux-NTFS Project. Retrieved February 9, 2005.
- «Attribute — $EA (0xE0)». NTFS Information, Linux-NTFS Project. Retrieved February 9, 2005.
- «Attribute — $STANDARD_INFORMATION (0x10)». NTFS Information, Linux-NTFS Project. Retrieved February 21, 2005.
- «Technical Note TN1150: HFS Plus Volume Format». Apple Inc. Retrieved September 22, 2015.
- Brian Carrier (2005). File System Forensic Analysis. Addison Wesley.
Further reading
Books
- Arpaci-Dusseau, Remzi H.; Arpaci-Dusseau, Andrea C. (2014). Operating Systems: Three Easy Pieces. Arpaci-Dusseau Books.
- Carrier, Brian (2005). File System Forensic Analysis. Addison-Wesley. ISBN 0-321-26817-2.
- Custer, Helen (1994). Inside the Windows NT File System. Microsoft Press. ISBN 1-55615-660-X.
- Giampaolo, Dominic (1999). Practical File System Design with the Be File System (PDF). Morgan Kaufmann Publishers. ISBN 1-55860-497-9. Archived (PDF) from the original on 2018-09-03. Retrieved 2019-12-15.
- McCoy, Kirby (1990). VMS File System Internals. VAX — VMS Series. Digital Press. ISBN 1-55558-056-4.
- Mitchell, Stan (1997). Inside the Windows 95 File System. O’Reilly. ISBN 1-56592-200-X.
- Nagar, Rajeev (1997). Windows NT File System Internals : A Developer’s Guide. O’Reilly. ISBN 978-1-56592-249-5.
- Pate, Steve D. (2003). UNIX Filesystems: Evolution, Design, and Implementation. Wiley. ISBN 0-471-16483-6.
- Rosenblum, Mendel (1994). The Design and Implementation of a Log-Structured File System. The Springer International Series in Engineering and Computer Science. Springer. ISBN 0-7923-9541-7.
- Russinovich, Mark; Solomon, David A.; Ionescu, Alex (2009). «File Systems». Windows Internals (5th ed.). Microsoft Press. ISBN 978-0-7356-2530-3.
- Prabhakaran, Vijayan (2006). IRON File Systems. PhD dissertation, University of Wisconsin-Madison.
- Silberschatz, Abraham; Galvin, Peter Baer; Gagne, Greg (2004). «Storage Management». Operating System Concepts (7th ed.). Wiley. ISBN 0-471-69466-5.
- Tanenbaum, Andrew S. (2007). Modern operating Systems (3rd ed.). Prentice Hall. ISBN 978-0-13-600663-3.
- Tanenbaum, Andrew S.; Woodhull, Albert S. (2006). Operating Systems: Design and Implementation (3rd ed.). Prentice Hall. ISBN 0-13-142938-8.
Online
- Benchmarking Filesystems (outdated) by Justin Piszcz, Linux Gazette 102, May 2004
- Benchmarking Filesystems Part II using kernel 2.6, by Justin Piszcz, Linux Gazette 122, January 2006
- Filesystems (ext3, ReiserFS, XFS, JFS) comparison on Debian Etch Archived 2008-09-13 at the Wayback Machine 2006
- Interview With the People Behind JFS, ReiserFS & XFS
- Journal File System Performance (outdated): ReiserFS, JFS, and Ext3FS show their merits on a fast RAID appliance
- Journaled Filesystem Benchmarks (outdated): A comparison of ReiserFS, XFS, JFS, ext3 & ext2
- Large List of File System Summaries (most recent update 2006-11-19)
- Linux File System Benchmarks v2.6 kernel with a stress on CPU usage
- «Linux 2.6 Filesystem Benchmarks (Older)». Archived from the original on 2016-04-15. Retrieved 2019-12-16.
- Linux large file support (outdated)
- Local Filesystems for Windows
- Overview of some filesystems (outdated)
- Sparse files support (outdated)
- Jeremy Reimer (March 16, 2008). «From BFS to ZFS: past, present, and future of file systems». arstechnica.com. Retrieved 2008-03-18.
External links
- «Filesystem Specifications — Links & Whitepapers». Archived from the original on 2015-11-03.
- Interesting File System Projects
It’s a bit tricky to explain what exactly a file system is in just one sentence.
That’s why I decided to write an article about it. This post is meant to be a high-level overview of file systems. But I’ll sneak into the lower-level concepts as well, as long as it doesn’t get boring.
Let’s start with a simple definition:
A file system defines how files are named, stored, and retrieved from a storage device.
Every time you open a file on your computer or smart device, your operating system uses its file system internally to load it from the storage device.
Or when you copy, edit, or delete a file, the file system handles it under the hood.
Whenever you download a file or access a web page over the Internet, a file system is involved too.
For instance, if you access a page on freeCodeCamp, your browser sends an HTTP request to freeCodeCamp’s server to fetch the page. If the requested resource is a file, it’s fetched from a file system.
When people talk about file systems, they might refer to different aspects of a file system depending on the context — that’s where things start to seem knotty.
And you might end up asking yourself, WHAT IS A FILE SYSTEM ANYWAY? 🤯
This guide helps you understand file systems in many contexts. I’ll cover partitioning and booting too!
To keep this guide manageable, I’ll concentrate on Unix-like environments when explaining the lower-level concepts or console commands.
However, these concepts remain relevant to other environments and file systems.
Why do we need a file system in the first place, you may ask?
Well, without a file system, the storage device would contain a big chunk of data stored back to back, and the operating system wouldn’t be able to tell them apart.
The term file system takes its name from the old paper-based data management systems, where we kept documents as files and put them into directories.
Imagine a room with piles of papers scattered all over the place.
A storage device without a file system would be in the same situation — and it would be a useless electronic device.
However, a file system changes everything:
A file system isn’t just a bookkeeping feature, though.
Space management, metadata, data encryption, file access control, and data integrity are the responsibilities of the file system too.
Everything begins with partitioning
Storage devices must be partitioned and formatted before the first use.
But what is partitioning?
Partitioning is splitting a storage device into several logical regions, so they can be managed separately as if they are separate storage devices.
We usually partition with a disk management tool provided by the operating system, or with a command-line tool provided by the system’s firmware (I’ll explain what firmware is).
A storage device must have at least one partition, and it can have more if needed.
Why should we split the storage devices into multiple partitions anyways?
The reason is that we don’t want to manage the whole storage space as a single unit and for a single purpose.
It’s just like how we partition our workspace, to separate (and isolate) meeting rooms, conference rooms, and various teams.
For example, a basic Linux installation has three partitions: one partition dedicated to the operating system, one for the users’ files, and an optional swap partition.
A swap partition works as the RAM extension when RAM runs out of space.
For instance, the OS might move a chunk of data (temporarily) from RAM to the swap partition to free up some space on the RAM.
Operating systems continuously use various memory management techniques to ensure every process has enough memory space to run.
Windows and macOS installations have a similar layout, but they don’t use a dedicated swap partition; instead, they swap to files within the partition on which you’ve installed your operating system.
On a computer with multiple partitions, you can install several operating systems, and every time choose a different operating system to boot up your system with.
The recovery and diagnostic utilities reside in dedicated partitions too.
For instance, to boot up a MacBook in recovery mode, you need to hold Command + R as soon as you restart (or turn on) your MacBook. By doing so, you instruct the system’s firmware to boot from a partition that contains the recovery program.
Partitioning isn’t just a way of installing multiple operating systems and tools, though; it also helps us keep critical system files apart from ordinary ones.
So no matter how many games you install on your computer, they won’t eat into the space set aside for the operating system, as long as they reside in a different partition.
Back to the office example, having a call center and a tech team in a common area would harm both teams’ productivity because each team has its own requirements to be efficient.
For instance, the tech team would appreciate a quieter area.
Some operating systems, like Windows, assign a drive letter (A, B, C, or D) to the partitions. For instance, the primary partition on Windows (on which Windows is installed) is known as C:, or drive C.
In Unix-like operating systems, however, partitions appear as ordinary directories under the root directory — we’ll cover this later.
In the next section, we’ll dive deeper into partitioning and get to know two concepts that will change your perspective on file systems: system firmware and booting.
Are you ready?
Away we go! 🏊‍♂️
Partitioning schemes, system firmware, and booting
When partitioning a storage device, we have two partitioning methods (or schemes 🙄) to choose from:
- Master boot record (MBR) Scheme
- GUID Partition Table (GPT) Scheme
Regardless of what partitioning scheme you choose, the first few blocks on the storage device will always contain critical data about your partitions.
The system’s firmware uses these data structures to boot up the operating system on a partition.
Wait, what is the system firmware? You may ask.
Here’s an explanation:
Firmware is low-level software embedded into electronic devices to operate the device, or to bootstrap another program that does it.
Firmware exists in computers, peripherals (keyboards, mice, and printers), or even electronic home appliances.
In computers, the firmware provides a standard interface for complex software like an operating system to boot up and work with hardware components.
However, on simpler systems like a printer, the firmware is the operating system. The menu you use on your printer is the interface of its firmware.
Hardware manufacturers make firmware based on two specifications:
- Basic Input/Output System (BIOS)
- Unified Extensible Firmware Interface (UEFI)
Firmware, whether BIOS-based or UEFI-based, resides in non-volatile memory, like a flash ROM attached to the motherboard.
When you press the power button on your computer, the firmware is the first program to run.
The mission of the firmware (among other things) is to boot up the computer, run the operating system, and pass it the control of the whole system.
Firmware can also run pre-OS environments (with network support), like recovery or diagnostic tools, or even a shell to run text-based commands.
The first few screens you see before your Windows logo appears are the output of your computer’s firmware, verifying the health of hardware components and the memory.
The initial check is confirmed with a beep (usually on PCs), indicating everything is good to go.
MBR partitioning and BIOS-based firmware
The MBR partitioning scheme is a part of the BIOS specifications and is used by BIOS-based firmware.
On MBR-partitioned disks, the first sector on the storage device contains essential data to boot up the system.
This sector is called the master boot record (MBR).
MBR contains the following information:
- The boot loader, which is a simple program (in machine code) to initiate the first stage of the booting process
- A partition table, which contains information about your partitions.
BIOS-based firmware boots the system differently than UEFI-based firmware.
Here’s how it works:
Once the system is powered on, the BIOS firmware starts and loads the boot loader program (contained in the MBR) into memory. Once the program is in memory, the CPU begins executing it.
Having the boot loader and the partition table in a predefined location like MBR enables BIOS to boot up the system without having to deal with any file.
If you are curious about how the CPU executes the instructions residing in the memory, you can read this beginner-friendly and fun guide on how the CPU works.
The boot loader code in the MBR takes between 434 and 446 bytes of the MBR space (out of 512 bytes). Additionally, 64 bytes are allocated to the partition table (four 16-byte entries), which can contain information about a maximum of four partitions. The last 2 bytes hold the boot signature (0x55AA).
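To make that layout a bit more tangible, here is a minimal Python sketch that splits a 512-byte sector into boot code and partition entries. It assumes the classic MBR layout (446 bytes of boot code, four 16-byte entries, then the 0x55AA signature). The function names and the synthetic sector are my own illustration, not part of any tool mentioned here.

```python
import struct

SECTOR_SIZE = 512
BOOT_CODE_SIZE = 446
ENTRY_SIZE = 16
SIGNATURE = b"\x55\xaa"
# status, 3 CHS bytes (skipped), type, 3 CHS bytes (skipped), first LBA, sector count
ENTRY_FORMAT = "<B3xB3xII"


def parse_mbr(sector):
    """Return (boot_code, partitions) for a classic 512-byte MBR sector."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly 512 bytes")
    if sector[510:512] != SIGNATURE:
        raise ValueError("missing 0x55AA boot signature")
    partitions = []
    for i in range(4):
        start = BOOT_CODE_SIZE + i * ENTRY_SIZE
        status, ptype, first_lba, sectors = struct.unpack(
            ENTRY_FORMAT, sector[start:start + ENTRY_SIZE])
        if ptype != 0:  # type 0 marks an unused entry
            partitions.append({"index": i + 1, "bootable": status == 0x80,
                               "type": ptype, "first_lba": first_lba,
                               "sectors": sectors})
    return sector[:BOOT_CODE_SIZE], partitions


def demo_sector():
    """Build a synthetic MBR with one bootable Linux (type 0x83) partition."""
    entry = struct.pack(ENTRY_FORMAT, 0x80, 0x83, 2048, 204800)
    return bytes(BOOT_CODE_SIZE) + entry + bytes(ENTRY_SIZE * 3) + SIGNATURE


if __name__ == "__main__":
    boot_code, partitions = parse_mbr(demo_sector())
    print(len(boot_code), partitions)
```

On a real machine you could feed it the first 512 bytes of a disk device instead of the synthetic sector, which normally requires root privileges.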
446 bytes isn’t big enough to accommodate much code, though. That’s why sophisticated boot loaders like GRUB 2 on Linux split their functionality into pieces or stages.
The smallest piece of code known as the first-stage boot loader is stored in the MBR. It’s usually a simple program, which doesn’t require much space.
The responsibility of the first-stage boot loader is to initiate the next (and more complicated) stages of the booting process.
Immediately after the MBR, and before the first partition starts, there’s a small space, around 1MB, called the MBR gap.
MBR gap can be used to place another piece of the boot loader program if needed.
A boot loader, such as GRUB 2, uses the MBR gap to store another stage of its functionality. GRUB calls this the stage 1.5 boot loader, which contains a file system driver.
Stage 1.5 enables the next stages of GRUB to understand the concept of files, rather than loading raw instructions from the storage device (like the first-stage boot loader).
The second stage boot loader, which is now capable of working with files, can load the operating system’s boot loader file to boot up the respective operating system.
This is when the operating system’s logo fades in…
Here’s the layout of an MBR-partitioned storage device:
And if we magnify the MBR, its content would look like this:
Although MBR is simple and widely supported, it has some limitations 😑.
MBR’s data structure limits the number of partitions to only four primary partitions.
A common workaround is to make an extended partition beside the primary partitions, as long as the total number of partitions won’t exceed four.
An extended partition can be split into multiple logical partitions. The way to make extended partitions differs across operating systems. In this quick guide, Microsoft explains how it should be done on Windows.
When making a partition, you can choose between primary and extended.
After this is solved, we’ll encounter the second limitation.
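Each partition can be a maximum of 2TiB 🙄, because the partition table stores sector counts as 32-bit numbers (2^32 sectors × 512 bytes per sector = 2TiB).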
And wait, there’s more!
The content of the MBR sector has no backup 😱, meaning if MBR gets corrupted due to an unexpected reason, we’ll have to find a way to recycle that useless piece of hardware.
This is where GPT partitioning stands out 😎.
GPT partitioning and UEFI-based firmware
The GPT partitioning scheme is more sophisticated than MBR and doesn’t have the limitations of MBR.
For instance, you can have as many partitions as your operating system allows.
And every partition can be the size of the biggest storage device available in the market — actually a lot more.
GPT is gradually replacing MBR, although MBR is still widely supported across old PCs and new ones.
As mentioned earlier, GPT is a part of the UEFI specification, which is replacing the good old BIOS.
That means that UEFI-based firmware uses a GPT-partitioned storage device to handle the booting process.
Many hardware platforms and operating systems now support UEFI and use the GPT scheme to partition storage devices.
In the GPT partitioning scheme, the first sector of the storage device is reserved for compatibility reasons with BIOS-based systems. The reason is some systems might still use a BIOS-based firmware but have a GPT-partitioned storage device.
This sector is called Protective MBR. (This is where the first-stage boot loader would reside in an MBR-partitioned disk)
After this first sector, the GPT data structures are stored, including the GPT header and the partition entries.
The GPT entries and the GPT header are backed up at the end of the storage device, so they can be recovered if the primary copy gets corrupted.
This backup is called Secondary GPT.
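Here is a minimal Python sketch of how the primary header points at its backup. It assumes the 92-byte little-endian header layout from the UEFI specification (signature `EFI PART`, then revision, sizes, checksums, and LBA fields). The function names and the synthetic header are hypothetical, and the sketch skips CRC verification for brevity.

```python
import struct

# signature, revision, header size, header CRC32, reserved,
# current LBA, backup LBA, first usable LBA, last usable LBA,
# disk GUID, partition entries LBA, entry count, entry size, entries CRC32
GPT_HEADER_FORMAT = "<8sIIIIQQQQ16sQIII"


def parse_gpt_header(block):
    """Decode the fixed part of a GPT header (normally found at LBA 1)."""
    size = struct.calcsize(GPT_HEADER_FORMAT)  # 92 bytes
    (signature, revision, header_size, _header_crc, _reserved,
     current_lba, backup_lba, first_usable, last_usable,
     _disk_guid, entries_lba, num_entries, entry_size, _entries_crc
     ) = struct.unpack(GPT_HEADER_FORMAT, block[:size])
    if signature != b"EFI PART":
        raise ValueError("not a GPT header")
    return {"header_size": header_size, "current_lba": current_lba,
            "backup_lba": backup_lba, "first_usable_lba": first_usable,
            "last_usable_lba": last_usable, "entries_lba": entries_lba,
            "num_entries": num_entries, "entry_size": entry_size}


def demo_header(total_sectors=2048):
    """Build a synthetic primary header for a tiny disk (checksums left at 0)."""
    return struct.pack(GPT_HEADER_FORMAT, b"EFI PART", 0x00010000, 92, 0, 0,
                       1, total_sectors - 1, 34, total_sectors - 34,
                       bytes(16), 2, 128, 128, 0)


if __name__ == "__main__":
    print(parse_gpt_header(demo_header()))
```

The `backup_lba` field is what a recovery tool would follow to find the Secondary GPT at the end of the device.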
The layout of a GPT-partitioned storage device looks like this:
In GPT, all the booting services (boot loaders, boot managers, pre-OS environments, and shells) live in a dedicated partition called EFI System Partition (ESP), which UEFI firmware can use.
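ESP even has its own file system, which is a specific version of FAT. On Linux, the ESP is usually mounted at /boot/efi (some distributions use /efi or /boot instead).
A quick way to tell whether your system was booted by UEFI firmware is to look for the /sys/firmware/efi directory, which the kernel only creates on UEFI boots.
If this path cannot be found on your system, then your firmware is probably BIOS-based firmware (or UEFI firmware doing a BIOS-style boot).
To check it out, you can try to change the directory to that path, like so:
cd /sys/firmware/efi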
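If you’d rather script that check, here is a minimal Python sketch. It assumes a Linux system where the /sys/firmware/efi directory is present only when the machine was booted through UEFI. The function names and the `sysfs_root` parameter are my own, added so the check can be tried against a fake directory tree.

```python
import os
import tempfile


def booted_with_uefi(sysfs_root="/sys"):
    """Return True if <sysfs_root>/firmware/efi exists as a directory."""
    return os.path.isdir(os.path.join(sysfs_root, "firmware", "efi"))


def fake_sysfs(with_efi):
    """Create a throwaway directory tree that mimics /sys (illustration only)."""
    root = tempfile.mkdtemp()
    if with_efi:
        os.makedirs(os.path.join(root, "firmware", "efi"))
    return root


if __name__ == "__main__":
    print("UEFI boot" if booted_with_uefi() else "BIOS-style boot (or not Linux)")
```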
UEFI-based firmware assumes that the storage device is partitioned with GPT and looks up the ESP in the GPT partition table.
Once the EFI partition is found, it looks for the configured boot loader, which is usually a file ending with .efi.
UEFI-based firmware gets the booting configuration from NVRAM (a non-volatile RAM).
NVRAM contains the booting settings and paths to the operating system boot loader files.
UEFI firmware can do a BIOS-style boot too (to boot the system from an MBR disk) if configured accordingly.
You can use the parted command on Linux to see what partitioning scheme is used for a storage device.
sudo parted -l
And the output would be something like this:
Model: Virtio Block Device (virtblk)
Disk /dev/vda: 172GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system  Name  Flags
14      1049kB  5243kB  4194kB                     bios_grub
15      5243kB  116MB   111MB   fat32              msftdata
 1      116MB   172GB   172GB   ext4
Based on the above output, the storage device’s ID is /dev/vda with a capacity of 172GB. The storage device is partitioned based on GPT and has three partitions; the second and third partitions are formatted based on the FAT32 and EXT4 file systems respectively.
Having a BIOS GRUB partition implies the firmware is still BIOS-based firmware.
Let’s confirm that with the dmidecode command like so:
sudo dmidecode -t 0
And the output would be:
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 2.4 present.
...
✅ Confirmed!
Formatting partitions
When partitioning is done, the partitions should be formatted.
Most operating systems allow you to format a partition based on a set of file systems.
For instance, if you are formatting a partition on Windows, you can choose between FAT32, NTFS, and exFAT file systems.
Formatting involves the creation of various data structures and metadata used to manage files within a partition.
These data structures are one aspect of a file system.
Let’s take the NTFS file system as an example.
When you format a partition to NTFS, the formatting process places the key NTFS data structures and the Master file table (MFT) on the partition.
Alright, let’s get back to file systems with our new background about partitioning, formatting, and booting.
How it started, how it’s going
A file system is a set of data structures, interfaces, abstractions, and APIs that work together to manage any type of file on any type of storage device, in a consistent manner.
Each operating system uses a particular file system to manage the files.
In the early days, Microsoft used FAT (FAT12, FAT16, and FAT32) in the MS-DOS and Windows 9x family.
With Windows NT 3.1, Microsoft introduced the New Technology File System (NTFS), which had many advantages over FAT32, such as supporting bigger files, allowing longer filenames, data encryption, access management, journaling, and a lot more.
NTFS has been the default file system of the Windows NT family (2000, XP, Vista, 7, 10, etc.) ever since.
NTFS isn’t suitable for non-Windows environments, though 🤷🏻.
For instance, macOS can read the content of an NTFS-formatted storage device (like flash memory) out of the box, but it can’t write anything to it unless you install an NTFS driver with write support.
Or you can just use the exFAT file system.
Extended File Allocation Table (exFAT) is a lightweight file system created by Microsoft in 2006 as a successor to FAT32, without the overhead of NTFS.
exFAT was designed for high-capacity removable devices, such as external hard disks, USB drives, and memory cards.
exFAT is the default file system used by SDXC cards.
Unlike NTFS, exFAT has read and write support on non-Windows environments as well, including macOS, which makes it a strong cross-platform choice for high-capacity removable storage devices.
So basically, if you have a removable disk you want to use on Windows, Mac, and Linux, formatting it to exFAT is a safe choice.
Apple has also developed and used various file systems over the years, including Hierarchical File System (HFS), HFS+, and recently Apple File System (APFS).
Where HFS+ relies on journaling, APFS uses copy-on-write metadata to protect against crashes, and it has been the default file system on Macs since the launch of macOS High Sierra in 2017.
But how about file systems in Linux distributions?
The Extended File System (ext) family of file systems was created for the Linux kernel — the core of the Linux operating system.
The first version of ext was released in 1992, but soon after, it was replaced by the second extended file system (ext2) in 1993.
In the 2000s, the third extended filesystem (ext3) and fourth extended filesystem (ext4) were developed for Linux with journaling capability.
ext4 is now the default file system in many distributions of Linux, including Debian and Ubuntu.
You can use the findmnt command on Linux to list your ext4-formatted partitions:
findmnt -lo source,target,fstype,used -t ext4
The output would be something like:
SOURCE TARGET FSTYPE USED
/dev/vda1 / ext4 3.6G
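If findmnt isn’t available, the same information can be pulled out of the kernel’s mount table. Below is a minimal Python sketch that filters mount-table text by file system type. It assumes the usual /proc/mounts line format (device, mount point, type, options, and two numeric fields). The function name and the sample text are made up for illustration.

```python
def mounts_of_type(mounts_text, fstype):
    """Return (device, mount_point) pairs whose file system type matches fstype."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] == fstype:
            result.append((fields[0], fields[1]))
    return result


# Sample text in /proc/mounts format (invented for this example).
SAMPLE_MOUNTS = """\
/dev/vda1 / ext4 rw,relatime 0 0
/dev/vda15 /boot/efi vfat rw,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec 0 0
"""

if __name__ == "__main__":
    print(mounts_of_type(SAMPLE_MOUNTS, "ext4"))
```

On a Linux machine you could pass it the content of /proc/mounts instead of the sample. Mount points that contain spaces show up with escape sequences there, which this sketch doesn’t decode.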
Architecture of file systems
A file system installed on an operating system consists of three layers:
- Physical file system
- Virtual file system
- Logical file system
These layers can be implemented as independent or tightly coupled abstractions.
When people talk about file systems, they refer to one of these layers or all three as one unit.
Although these layers are different across operating systems, the concept is the same.
The physical layer is the concrete implementation of a file system; it’s responsible for data storage and retrieval and space management on the storage device (or, more precisely, on its partitions).
The physical file system interacts with the storage hardware via device drivers.
The next layer is the virtual file system or VFS.
The virtual file system provides a consistent view of various file systems mounted on the same operating system.
So does this mean an operating system can use multiple file systems at the same time?
The answer is yes!
It’s common for a removable storage medium to have a different file system than that of a computer.
For instance, on Windows (which uses NTFS as the primary file system), a flash memory might have been formatted to exFAT or FAT32.
That said, the operating system should provide a unified interface between computer programs (file explorers and other apps that work with files) and the different mounted file systems (such as NTFS, APFS, ext4, FAT32, exFAT, and UDF).
For instance, when you open up your file explorer program, you can copy an image from an ext4 file system and paste it over to your exFAT-formatted flash memory — without having to know that files are managed differently under the hood.
This convenient layer between the user (you) and the underlying file systems is provided by the VFS.
A VFS defines a contract that all physical file systems must implement to be supported by that operating system.
However, this compliance isn’t built into the file system core, meaning the source code of a file system doesn’t include support for every operating system’s VFS.
Instead, a file system driver makes the file system adhere to the VFS rules of each operating system. A driver is a program that enables software to communicate with other software or hardware.
Although VFS is responsible for providing a standard interface between programs and various file systems, computer programs don’t interact with VFS directly.
Instead, they use a unified API between programs and the VFS.
Can you guess what it is?
Yes, we’re talking about the logical file system.
The logical file system is the user-facing part of a file system, which provides an API to enable user programs to perform various file operations, such as OPEN, READ, and WRITE, without having to deal with any storage hardware.
On the other hand, VFS provides a bridge between the logical layer (which programs interact with) and the physical layers of the various file systems.
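To see how the three layers could fit together, here is a toy Python sketch. Everything in it is hypothetical: `PhysicalFileSystem` stands in for the contract a driver implements, `InMemoryFS` fakes two physical file systems with dictionaries, and `VFS` routes each path to whatever is mounted at the longest matching mount point. Real kernels are far more involved, so treat it as a mental model only.

```python
from abc import ABC, abstractmethod


class PhysicalFileSystem(ABC):
    """Hypothetical contract a file system driver implements for the VFS."""

    @abstractmethod
    def read(self, path):
        """Return the bytes stored at path."""

    @abstractmethod
    def write(self, path, data):
        """Store data at path."""


class InMemoryFS(PhysicalFileSystem):
    """Stand-in for a real driver (ext4, exFAT, ...); keeps files in a dict."""

    def __init__(self, name):
        self.name = name
        self._files = {}

    def read(self, path):
        return self._files[path]

    def write(self, path, data):
        self._files[path] = data


class VFS:
    """Routes a path to the file system mounted at the longest matching mount point."""

    def __init__(self):
        self._mounts = {}

    def mount(self, mount_point, fs):
        self._mounts[mount_point.rstrip("/") or "/"] = fs

    def _resolve(self, path):
        matches = [m for m in self._mounts
                   if path == m or path.startswith(m.rstrip("/") + "/")]
        if not matches:
            raise FileNotFoundError(path)
        best = max(matches, key=len)
        relative = path[len(best.rstrip("/")):] or "/"
        return self._mounts[best], relative

    def read(self, path):
        fs, relative = self._resolve(path)
        return fs.read(relative)

    def write(self, path, data):
        fs, relative = self._resolve(path)
        fs.write(relative, data)


def copy(vfs, source, destination):
    """What a file explorer does: it never needs to know which driver is involved."""
    vfs.write(destination, vfs.read(source))
```

With a root file system mounted at / and a second one at /media/usb, `copy(vfs, "/home/pic.jpg", "/media/usb/pic.jpg")` moves data between two different implementations through one uniform interface.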
What does it mean to mount a file system?
On Unix-like systems, the VFS assigns a device ID (for instance, /dev/disk1s1) to each partition or removable storage device.
Then, it creates a virtual directory tree and puts the content of each device under that directory tree as separate directories.
The act of assigning a directory to a storage device (under the root directory tree) is called mounting, and the assigned directory is called a mount point.
That said, on a Unix-like operating system, all partitions and removable storage devices appear as if they are directories under the root directory.
For instance, on Linux, the mount points for removable devices (such as a memory card) are usually under the /media directory.
That said, once a flash memory is attached to the system and auto-mounted at the default mount point (/media in this case), its content becomes available under the /media directory.
However, there are times you need to mount a file system manually.
On Linux, it’s done like so:
mount /dev/disk1s1 /media/usb
In the above command, the first parameter is the device ID (/dev/disk1s1), and the second parameter (/media/usb) is the mount point.
Please note that the mount point should already exist as a directory.
If it doesn’t, it has to be created first:
mkdir -p /media/usb
mount /dev/disk1s1 /media/usb
If the mount-point directory already contains files, those files will be hidden for as long as the device is mounted.
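If you want to check from a program whether a directory is currently a mount point, Python's standard library has `os.path.ismount`. The sketch below wraps it in a small helper. The name `describe_mount` is only an illustration, and the check reports what is mounted right now; it does not mount anything.

```python
import os
import tempfile


def describe_mount(path: str) -> str:
    """Say whether path is a mount point or an ordinary directory."""
    kind = "a mount point" if os.path.ismount(path) else "an ordinary directory"
    return f"{path} is {kind}"


if __name__ == "__main__":
    print(describe_mount("/"))  # the root directory is always a mount point
    with tempfile.TemporaryDirectory() as workdir:
        plain = os.path.join(workdir, "plain")
        os.mkdir(plain)
        print(describe_mount(plain))  # a freshly created directory is not
```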
File metadata is a data structure that contains data about a file, such as:
- File size
- Timestamps, like creation date, last accessed date, and modification date
- The file’s owner
- The file’s mode (who can do what with the file)
- What blocks on the partition are allocated to the file
- and a lot more
Metadata isn’t stored with the file content, though. Instead, it’s stored in a different place on the disk — but associated with the file.
In Unix-like systems, the metadata is stored in data structures called inodes.
Inodes are identified by a unique number called the inode number.
Inodes are associated with files in a table called the inode table.
Each file on the storage device has an inode, which contains information about it such as the time it was created, modified, etc.
The inode also includes the addresses of the blocks allocated to the file; in other words, where exactly the file is located on the storage device.
In an ext4 inode, the address of the allocated blocks is stored as a set of data structures called extents (within the inode).
Each extent contains the address of the first data block allocated to the file and the number of contiguous blocks the file occupies from that point.
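The idea behind extents can be sketched in a few lines of Python: instead of listing every block number, you store one (start, length) pair per contiguous run. This is a simplified model for illustration only. The `Extent` class and `blocks_to_extents` function are made-up names, and the real ext4 on-disk extent format has more fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    start: int   # number of the first block in the run
    length: int  # how many contiguous blocks the run covers


def blocks_to_extents(blocks: list[int]) -> list[Extent]:
    """Collapse a sorted list of block numbers into (start, length) runs."""
    extents: list[Extent] = []
    for block in blocks:
        last = extents[-1] if extents else None
        if last is not None and block == last.start + last.length:
            # The block continues the current run: extend it by one.
            extents[-1] = Extent(last.start, last.length + 1)
        else:
            # There is a gap: start a new run.
            extents.append(Extent(block, 1))
    return extents


if __name__ == "__main__":
    print(blocks_to_extents([250, 251, 252, 312, 313]))
```

A file stored in one contiguous run needs a single extent, no matter how many blocks it occupies. That is why extents are compact for files that are not fragmented.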
Whenever you open a file on Linux, its name is first resolved to an inode number.
Having the inode number, the file system fetches the respective inode from the inode table.
Once the inode is fetched, the file system starts to compose the file from the data blocks registered in the inode.
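You can read the same metadata from a program. Here is a small sketch using Python's `os.stat`, which asks the operating system for a file's metadata without reading its content. The helper name `summarize` and the selection of fields are choices made for this example.

```python
import os
import stat
import tempfile


def summarize(path: str) -> dict:
    """Return a few metadata fields for path as reported by the operating system."""
    info = os.stat(path)
    return {
        "inode": info.st_ino,                 # inode number (file index on Windows)
        "size": info.st_size,                 # file size in bytes
        "mode": stat.filemode(info.st_mode),  # permission string such as "-rw-r--r--"
        "modified": info.st_mtime,            # last modification, seconds since the epoch
        "links": info.st_nlink,               # number of names pointing at this inode
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        sample = os.path.join(workdir, "sample.txt")
        with open(sample, "wb") as handle:
            handle.write(b"hello")
        print(summarize(sample))
```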
You can use the df command with the -i parameter on Linux to see the inodes (total, used, and free) in your partitions:
df -i
The output would look like this:
udev 4116100 378 4115722 1% /dev
tmpfs 4118422 528 4117894 1% /run
/dev/vda1 6451200 175101 6276099 3% /
As you can see, the partition /dev/vda1 has a total of 6,451,200 inodes, of which 3% (175,101 inodes) have been used.
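The same counters are available to programs on Unix-like systems through `os.statvfs`. The sketch below is a rough equivalent of one row of the `df -i` output. The function name `inode_usage` is an assumption for this example, and the percentage may be rounded slightly differently from what `df` prints.

```python
import os


def inode_usage(path: str = "/") -> dict:
    """Report total, used, and free inodes for the file system holding path (Unix only)."""
    vfs = os.statvfs(path)
    total, free = vfs.f_files, vfs.f_ffree
    used = total - free
    # Some file systems report zero inodes, so guard against division by zero.
    percent = round(100 * used / total) if total else 0
    return {"total": total, "used": used, "free": free, "percent_used": percent}


if __name__ == "__main__":
    print(inode_usage("/"))
```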
To see the inodes associated with files in a directory, you can use the ls command with the -li parameters:
ls -li
And the output would be:
1303834 -rw-r--r-- 1 root www-data 2502 Jul 8 2019 wp-links-opml.php
1303835 -rw-r--r-- 1 root www-data 3306 Jul 8 2019 wp-load.php
1303836 -rw-r--r-- 1 root www-data 39551 Jul 8 2019 wp-login.php
1303837 -rw-r--r-- 1 root www-data 8403 Jul 8 2019 wp-mail.php
1303838 -rw-r--r-- 1 root www-data 18962 Jul 8 2019 wp-settings.php
The first column is the inode number associated with each file.
The number of inodes on a partition is decided when you format the partition. From then on, you can store new files on the device only as long as you have both free space and unused inodes.
It’s unlikely that a personal Linux OS would run out of inodes. However, enterprise services that deal with a large number of files (like mail servers) have to manage their inode quota smartly.
On NTFS, the metadata is stored differently, though.
NTFS keeps file information in a data structure called the Master File Table (MFT).
Every file has at least one entry in the MFT, which contains everything about it, including its location on the storage device, similar to the inode table.
On most operating systems, you can grab metadata via the graphical user interface.
For instance, when you right-click on a file on Mac OS, and select Get Info (Properties in Windows), a window appears with information about the file. This information is fetched from the respective file’s metadata.
Space Management
Storage devices are divided into fixed-sized units called sectors.
A sector is the minimum storage unit on a storage device and is typically either 512 bytes or 4096 bytes (Advanced Format).
However, file systems use a higher-level concept as the storage unit, called blocks.
Blocks are an abstraction over physical sectors; each block usually consists of multiple sectors.
Depending on the file size, the file system allocates one or more blocks to each file.
Speaking of space management, the file system is aware of every used and unused block on the partitions, so it’ll be able to allocate space to new files or fetch the existing ones when requested.
The most basic storage unit in ext4-formatted partitions is the block. However, contiguous blocks are grouped into block groups for easier management.
Each block group has its own data structures and data blocks.
Here are the data structures a block group can contain:
- Super Block: a metadata repository, which contains metadata about the entire file system, such as the total number of blocks in the file system, total blocks in block groups, inodes, and more. Not all block groups contain the superblock, though. A certain number of block groups store a copy of the superblock as a backup.
- Group Descriptors: bookkeeping information for each block group
- Inode Bitmap: each block group has its own inode quota for storing files. The inode bitmap is a data structure used to identify used and unused inodes within the block group. 1 denotes used and 0 denotes unused inode objects.
- Block Bitmap: a data structure used to identify used and unused data blocks within the block group. 1 denotes used and 0 denotes unused data blocks.
- Inode Table: a data structure that defines the relation of files and their inodes. The number of inodes stored in this area is related to the block size used by the file system.
- Data Blocks: the zone within the block group where file contents are stored.
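A bitmap like the ones in the list above is easy to model. The following sketch keeps one bit per block and hands out the first free block on request. It is only a toy model with made-up names (`BlockBitmap`, `allocate`), not the way ext4 actually searches its bitmaps.

```python
class BlockBitmap:
    """One bit per block: 1 means used, 0 means free."""

    def __init__(self, block_count: int):
        self.block_count = block_count
        self.bits = bytearray((block_count + 7) // 8)  # 8 blocks per byte

    def is_used(self, block: int) -> bool:
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def mark_used(self, block: int) -> None:
        self.bits[block // 8] |= 1 << (block % 8)

    def mark_free(self, block: int) -> None:
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF

    def allocate(self) -> int:
        """Claim and return the first free block, or raise if none is left."""
        for block in range(self.block_count):
            if not self.is_used(block):
                self.mark_used(block)
                return block
        raise OSError("no free blocks in this group")


if __name__ == "__main__":
    bitmap = BlockBitmap(16)
    print(bitmap.allocate(), bitmap.allocate())  # the first two free blocks
```

Tracking 16 blocks takes only 2 bytes here, which is why bitmaps are a popular way to record free and used space.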
Ext4 even goes one step further (compared to ext3) and organizes block groups into bigger groups called flex block groups.
The data structures of each block group, including the block bitmap, inode bitmap, and inode table, are concatenated and stored in the first block group within each flex block group.
Having all the data structures concatenated in one block group (the first one) frees up more contiguous data blocks on other block groups within each flex block group.
These concepts might be confusing, but you don’t have to master every bit of them. It’s just to depict the depth of file systems.
The layout of the first block group looks like this:
When a file is being written to a disk, it is written to one or more blocks within a block group.
Managing files at the block group level improves the performance of the file system significantly, as opposed to organizing files as one unit.
Size vs size on disk
Have you ever noticed that your file explorer displays two different sizes for each file: size and size on disk?
Why are size and size on disk slightly different, you may ask.
Here’s an explanation:
We already know that, depending on the file size, one or more blocks are allocated to a file.
One block is the minimum space that can be allocated to a file. This means the remaining space of a partially-filled block cannot be used by another file. This is the rule!
Since the size of a file is rarely an exact multiple of the block size, the last block might be only partially used, and the remaining space would stay unused, or would be filled with zeros.
So «size» is basically the actual file size, while «size on disk» is the space it has occupied, even though it’s not using it all.
You can use the du command on Linux to see it yourself.
du -b "icon-link.svg"
The output would be something like this:
623 icon-link.svg
And to check the size on disk:
du -B 1 "icon-link.svg"
Which will result in:
4096 icon-link.svg
Based on the output, the allocated space is 4 KB (4096 bytes), while the actual file size is 623 bytes. This means the block size on this file system is 4 KB.
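The arithmetic behind the two numbers is a simple round-up to the next whole block. Here is a sketch, assuming a 4096-byte block size and ignoring special cases such as sparse or compressed files. The function names are made up for this example.

```python
def size_on_disk(file_size: int, block_size: int = 4096) -> int:
    """Round file_size up to a whole number of blocks."""
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size


def slack(file_size: int, block_size: int = 4096) -> int:
    """Bytes allocated to the file but not used by its content."""
    return size_on_disk(file_size, block_size) - file_size


if __name__ == "__main__":
    print(size_on_disk(623), slack(623))
```

For the 623-byte file above, one block of 4096 bytes is allocated, so 3473 bytes of that block stay unused.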
What is disk fragmentation?
Over time, new files are written to the disk, and existing files grow, shrink, or get deleted.
These frequent changes leave many small gaps (runs of free blocks) between files. On top of that, some files don’t fill their last block, for the same reason file size and size on disk differ, so space is wasted inside blocks too. Over time, there won’t be enough contiguous free blocks to store new files in one piece.
That’s when new files need to be stored as fragments.
File Fragmentation occurs when a file is stored as fragments on the storage device because the file system cannot find enough contiguous blocks to store the whole file in a row.
Let’s make it more clear with an example.
Imagine you have a Word document named myfile.docx.
myfile.docx is initially stored in a few contiguous blocks on the disk; let’s say this is how the blocks are named: LBA250, LBA251, and LBA252.
Now, if you add more content to myfile.docx and save it, it will need to occupy more blocks on the storage medium.
Since myfile.docx is currently stored on LBA250, LBA251, and LBA252, the new content should preferably sit within LBA253 and so forth, depending on how many more blocks are needed to accommodate the new changes.
Now, imagine LBA253 is already taken by another file (maybe it’s the first block of another file). In that case, the new content of myfile.docx has to be stored on different blocks somewhere else on the disk, for instance, LBA312 and LBA313.
myfile.docx got fragmented 💔.
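The scenario above can be replayed with a toy allocator. In this sketch the disk is just a Python list in which each slot holds the name of the file that owns the block (or `None` if the block is free). The allocator prefers the block right after the file's last one and falls back to the first free block anywhere. Both the policy and the names `grow_file` and `fragments` are assumptions for illustration; real allocators are far smarter.

```python
def grow_file(disk: list, name: str, extra: int) -> None:
    """Give `name` `extra` more blocks, preferring the block right after its last one."""
    owned = [i for i, owner in enumerate(disk) if owner == name]
    cursor = owned[-1] + 1 if owned else 0
    for _ in range(extra):
        # If the adjacent block is taken (or off the end), jump to the first free block.
        if cursor >= len(disk) or disk[cursor] is not None:
            free = [i for i, owner in enumerate(disk) if owner is None]
            if not free:
                raise OSError("disk full")
            cursor = free[0]
        disk[cursor] = name
        cursor += 1


def fragments(disk: list, name: str) -> int:
    """Count the contiguous runs of blocks owned by `name`."""
    owned = [i for i, owner in enumerate(disk) if owner == name]
    return sum(1 for k, block in enumerate(owned) if k == 0 or block != owned[k - 1] + 1)


if __name__ == "__main__":
    disk = ["myfile", "myfile", "myfile", "other", None, None, None]
    grow_file(disk, "myfile", 2)
    print(disk, fragments(disk, "myfile"))
```

Because the block right after the file is owned by another file, the two new blocks land further away and the file ends up in two fragments.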
File fragmentation puts a burden on the file system because every time a fragmented file is requested by a user program, the file system needs to collect every piece of the file from various locations on a disk.
This overhead applies to saving the file back to the disk as well.
Fragmentation might also occur when a file is written to the disk for the first time, probably because the file is huge and not many contiguous blocks are left on the partition.
Fragmentation is one of the reasons some operating systems get slow as the file system ages.
Should We Care About Fragmentation these days?
The short answer is: not anymore!
Modern file systems use smart algorithms to avoid (or early-detect) fragmentation as much as possible.
Ext4 also does some sort of preallocation, which involves reserving blocks for a file before they are actually needed — making sure the file won’t get fragmented if it gets bigger over time.
The number of preallocated blocks is recorded in the length field of the file’s extent in its inode.
Additionally, ext4 uses an allocation technique called delayed allocation.
The idea is that, instead of allocating data blocks one at a time during a write, the allocation requests are accumulated in a buffer and the data is written to the disk at once.
Not having to call the file system’s block allocator on every write request helps the file system make better choices with distributing the available space. For instance, by placing large files apart from smaller files.
Imagine that a small file is located between two large files. Now, if the small file is deleted, it leaves a small space between the two files.
Spreading the files out in this manner leaves enough gaps between data blocks, which helps the filesystem manage (and avoid) fragmentation more easily.
Delayed allocation actively reduces fragmentation and increases performance.
Directories
A Directory (Folder in Windows) is a special file used as a logical container to group files and directories within a file system.
On NTFS and Ext4, directories and files are treated the same way. In other words, directories are just files that have their own inode (on Ext4) or MFT entry (on NTFS).
The inode or MFT entry of a directory contains information about that directory, as well as a collection of entries pointing to the files «under» that directory.
The files aren’t literally contained within the directory, but they are associated with the directory in a way that they appear as the directory’s children at a higher level, such as in a file explorer program.
These entries are called directory entries. Directory entries contain file names mapped to their inode/MFT entry.
In addition to the directory entries, there are two more entries: the . entry, which points to the directory itself, and the .. entry, which points to the parent directory.
On Linux, you can use ls in a directory to see the directory entries with their associated inode numbers:
ls -lai
And the output would be something like this:
63756 drwxr-xr-x 14 root root 4096 Dec 1 17:24 .
2 drwxr-xr-x 19 root root 4096 Dec 1 17:06 ..
81132 drwxr-xr-x 2 root root 4096 Feb 18 06:25 backups
81020 drwxr-xr-x 14 root root 4096 Dec 2 07:01 cache
81146 drwxrwxrwt 2 root root 4096 Oct 16 21:43 crash
80913 drwxr-xr-x 46 root root 4096 Dec 1 22:14 lib
...
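Here is a rough Python counterpart of that listing, built on `os.scandir`. Since `os.scandir` skips the . and .. entries, the sketch adds them by hand. The helper name `list_entries` is an assumption for this example.

```python
import os
import tempfile


def list_entries(directory: str) -> dict:
    """Map each name in directory to its inode number, including '.' and '..'."""
    entries = {
        ".": os.stat(directory).st_ino,
        "..": os.stat(os.path.join(directory, os.pardir)).st_ino,
    }
    with os.scandir(directory) as listing:
        for entry in listing:
            entries[entry.name] = entry.inode()  # inode number stored in the directory entry
    return entries


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        os.mkdir(os.path.join(workdir, "backups"))
        for name, inode in list_entries(workdir).items():
            print(inode, name)
```

If you list a subdirectory the same way, its .. entry carries the same inode number as the parent's . entry, which is how the directory tree is linked together.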
Rules for naming files
Some file systems enforce limitations on filenames.
The limitation can be in the length of the filename or filename case sensitivity.
For instance, in NTFS (Windows) and APFS (Mac) file systems, MyFile and myfile refer to the same file by default, while on ext4 (Linux), they point to different files.
Why does this matter, you may ask.
Imagine that you’re creating a web page on your Windows machine. The web page contains your company logo, which is a PNG file, like this:
<!DOCTYPE html>
<html>
<head>
<title>Products - Your Website</title>
</head>
<body>
<!--SOME CONTENT-->
<img src="img/logo.png">
<!--SOME MORE CONTENT-->
</body>
</html>
If the actual file name is Logo.png (note the capital L), you can still see the image when you open your web page in your web browser (on your Windows machine).
However, once you deploy it to a Linux server and view it live, you’ll see a broken image.
Why?
Because in Linux (ext4 file system) logo.png and Logo.png point to two different files.
So keep that in mind when developing on Windows and deploying to a Linux server.
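One way to catch this before deploying is to compare the names your pages reference against the exact spelling stored on disk. The sketch below does that with `os.listdir`, which returns names exactly as they are stored, even on a case-insensitive file system. The function names are made up for this example.

```python
import os
import tempfile


def exact_name_exists(directory: str, name: str) -> bool:
    """True only if directory holds an entry spelled exactly like name."""
    return name in os.listdir(directory)


def find_case_mismatch(directory: str, name: str):
    """Return the on-disk spelling if name matches only when case is ignored."""
    for actual in os.listdir(directory):
        if actual != name and actual.lower() == name.lower():
            return actual
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as img_dir:
        open(os.path.join(img_dir, "Logo.png"), "w").close()
        print(exact_name_exists(img_dir, "logo.png"))   # the reference used in the HTML
        print(find_case_mismatch(img_dir, "logo.png"))  # the spelling actually on disk
```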
Rules for file size
One important aspect of file systems is the maximum file size they support.
An old file system like FAT32 (used by MS-DOS 7.1+, the Windows 9x family, and flash memories) can’t store files larger than 4 GB, while its successor, NTFS, allows file sizes of up to 16 EB (about 16 million TB).
Like NTFS, exFAT allows a file size of 16 EB too. This makes exFAT an ideal option for storing massive data objects, such as video files.
Practically, there’s no limitation on the file size in the exFAT and NTFS file systems.
Linux’s ext4 and Apple’s APFS support files up to 16 TiB and 8 EiB respectively.
File manager programs
As you know, the logical layer of the file system provides an API to enable user applications to perform file operations, such as read, write, delete, and execute operations.
The file system’s API is a low-level mechanism, though, designed for computer programs, runtime environments, and shells — not designed for daily use.
That said, operating systems provide convenient file management utilities out of the box for your day-to-day file management.
For instance, File Explorer on Windows, Finder on Mac OS, and Nautilus on Ubuntu are examples of file manager programs.
These utilities use the logical file system’s API under the hood.
Apart from these GUI tools, operating systems expose the file system’s APIs via the command-line interfaces too, like Command Prompt on Windows, and Terminal on Mac and Linux.
These text-based interfaces help users do all sorts of file operations as text commands, like we did in the previous examples.
File access management
Not everyone should be able to remove or modify a file they don’t own or aren’t authorized to change.
Modern file systems provide mechanisms to control users’ access and capabilities concerning files.
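The data regarding user permissions and file ownership is stored in a data structure called an Access-Control List (ACL), which is made up of Access-Control Entries (ACEs). Unix-like operating systems (Linux and Mac OS) additionally keep a basic set of owner, group, and permission bits (the file’s mode) in the file’s metadata.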
This feature is also available in the CLI (Command prompt or Terminal), where a user can change file ownerships or limit permissions of each file right from the command line interface.
For instance, a file owner (on Linux or Mac) can configure a file to be available to the public, like so:
chmod 777 myfile.txt
777 means everyone can do every operation (read, write, execute) on myfile.txt. Please note this is just an example; in practice, you should avoid setting a file’s permissions to 777.
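The same change can be made from a program. Here is a sketch using Python's `os.chmod`, with the more restrictive mode 644 (owner can read and write, everyone else can only read) as the example. The helper name `set_mode` is made up, and on Windows `os.chmod` can only toggle the read-only flag, so the result there will differ.

```python
import os
import stat
import tempfile


def set_mode(path: str, mode: int) -> str:
    """Apply mode to path and return the resulting permission string."""
    os.chmod(path, mode)
    return stat.filemode(os.stat(path).st_mode)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "myfile.txt")
        open(target, "w").close()
        # 0o644: owner may read and write; group and others may only read.
        print(set_mode(target, 0o644))
```

Each octal digit covers one audience (owner, group, others), and each digit is the sum of 4 (read), 2 (write), and 1 (execute). That is why 7 means "everything" and 777 means "everything for everyone".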
Maintaining data integrity
Let’s suppose you’ve been working on your thesis for a month now. One day, you open the file, make some changes and save it.
Once you save the file, your word processor program sends a “write” request to the file system’s API (the logical file system).
The request is eventually passed down to the physical layer to store the file on several blocks.
But what if the system crashes while the older version of the file is being replaced with the new version?
In older file systems (like FAT32 or ext2) the data would be corrupted because it was partially written to the disk.
This is less likely to happen with modern file systems as they use a technique called journaling.
Journaling file systems record every operation that’s about to happen in the physical layer but hasn’t happened yet.
The main purpose is to keep track of the changes that haven’t yet been committed to the file system physically.
The journal is a special allocation on the disk where each writing attempt is first stored as a transaction.
Once the data is physically placed on the storage device, the change is committed to the filesystem.
In case of a system failure, the file system will detect the incomplete transaction and roll it back as if it never happened.
That said, the new content (that was being written) may still be lost, but the existing data would remain intact.
Modern file systems such as NTFS, APFS, and ext4 (even ext3) use journaling to avoid data corruption in case of system failure.
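To see why logging first protects the existing data, here is a toy model of the idea in Python. It is a deliberately simplified sketch with made-up names (`JournaledStore`, `crash_before_commit`). Real journals live on disk, and file systems like ext4 journal only metadata by default, so treat this as an illustration of the principle rather than of any actual implementation.

```python
class JournaledStore:
    """Toy block store that logs each write before applying it."""

    def __init__(self):
        self.blocks = {}   # the "main area": block number -> data
        self.journal = []  # transactions that were logged but not yet finished

    def write(self, updates: dict, crash_before_commit: bool = False) -> None:
        tx = {"updates": dict(updates), "committed": False}
        self.journal.append(tx)            # 1. record the intent in the journal
        if crash_before_commit:
            return                         # simulated power loss: stop right here
        tx["committed"] = True             # 2. mark the transaction as complete
        self.blocks.update(tx["updates"])  # 3. apply it to the main area
        self.journal.remove(tx)            # 4. the journal entry is no longer needed

    def recover(self) -> None:
        """Run after a crash: replay complete transactions, drop incomplete ones."""
        for tx in self.journal:
            if tx["committed"]:
                self.blocks.update(tx["updates"])
        self.journal.clear()


if __name__ == "__main__":
    store = JournaledStore()
    store.write({1: b"old thesis"})
    store.write({1: b"new thesis"}, crash_before_commit=True)
    store.recover()
    print(store.blocks)  # the old version survives; the interrupted write is gone
```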
Database File Systems
Typical file systems organize files as directory trees.
To access a file, you traverse to the respective directory, and you’ll have it.
cd /music/country/highwayman
However, in a database file system, there’s no concept of paths and directories.
The database file system is a faceted system which groups files based on various attributes and dimensions.
For instance, MP3 files can be listed by artist, genre, release year, and album — at the same time!
A database file system is more like a high-level application to help you organize and access your files more easily and more efficiently. However, you won’t be able to access the raw files outside of this application.
A database file system cannot replace a typical file system, though. It’s just a high-level abstraction for easier file management on some systems.
The iTunes app on Mac OS is a good example of a database file system.
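The faceted idea can be sketched with a small in-memory index: every file is registered under each of its attributes, and a query is just the intersection of the matching sets. The class name `FacetedIndex`, the file names, and the attribute values below are all hypothetical.

```python
from collections import defaultdict


class FacetedIndex:
    """Index items by attribute values instead of by directory path."""

    def __init__(self):
        self.index = defaultdict(set)  # (facet, value) -> set of item names

    def add(self, name: str, **facets: str) -> None:
        for facet, value in facets.items():
            self.index[(facet, value)].add(name)

    def find(self, **facets: str) -> set:
        """Return the items that match every given facet."""
        matches = [self.index.get((facet, value), set()) for facet, value in facets.items()]
        return set.intersection(*matches) if matches else set()


if __name__ == "__main__":
    library = FacetedIndex()
    library.add("track-a.mp3", artist="Artist A", genre="country", year="1985")
    library.add("track-b.mp3", artist="Artist B", genre="country", year="1973")
    print(library.find(genre="country"))
    print(library.find(genre="country", year="1985"))
```

The same file shows up under its artist, its genre, and its year at the same time, which a single directory path cannot express.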
Wrapping Up
Wow! You made it to the end, which means you know a lot more about file systems now. But I’m sure this won’t be the end of your file system studies.
So again — can we describe what a file system is and how it works in one sentence?
We can’t! 😁
But let’s finish this post with the brief description I used at the beginning:
A file system defines how files are named, stored, and retrieved from the storage device.
Alright, I think that does it for this write-up. If you notice something is missing or that I’ve gotten wrong, please let me know in the comments below. That would help me and others too!
By the way, if you like more comprehensive guides like this one, visit my website decodingweb.dev and follow me on Twitter because, besides freeCodeCamp, those are the channels I use to share my everyday findings.
Thanks for reading, and enjoy learning! 😃
The big difference between Linux and Windows, at least when it comes to their filesystems and directory trees is that in Linux «everything is a file», and everything descends from a single root. This also applies to almost all Unix-derived OSes such as BSD, OS X, Solaris, etc., but I’m going to just say «Linux» to be generic (if not entirely accurate).
But what does that mean in practice?
Windows allows for multiple named roots for its filesystems. You know these as drive letters: C: D: E: and so on. Each one has a root (\), and a tree that descends from it. Recent versions of Windows allow for things like volume mount points, where a volume (what you’d consider a partition) can be mounted to an existing, empty folder. So instead of D: representing the root of, say, your optical (CD/DVD/BR) drive, you could mount it at C:\Optical instead. This is more similar to what Linux does. There’s also an underlying, single-rooted object namespace for everything in Windows, similar to what Linux uses and managed by the Object Manager, but most users rarely see it referenced since it’s primarily for kernel use.
Linux has a single root: /. Everything descends from it, and it doesn’t necessarily need to represent your hard drive. Hard Drives, Optical Drives, Memory Cards, Network Shares, Printers, Scanners, CPUs, RAM, Processes, … everything is represented somewhere inside this single namespace, and can be accessed by any process with standard file management APIs, presuming you have a high enough level of access. Just because you can read or write from it doesn’t mean it’s a file on your hard drive in Linux. For example, devices are typically mounted into /dev, so accessing things in there often means you’re talking to a device: maybe it’s the sound card, or a scanner, or a camera, etc. These are known as device files. Procfs is a special «filesystem» that’s normally mounted to /proc and has a «directory» for every running process, with files in each directory relating to things like the command line used to invoke that process, memory maps, open files, etc. Sysfs is another special filesystem (mounted on /sys) used to expose a wealth of information about the running kernel objects and can also be used to fine-tune the running kernel by simply writing to a particular file.
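Because procfs entries behave like ordinary files, a program can read them with plain file APIs. The sketch below reads one entry for the current process on Linux. The helper name `read_proc_field` is an assumption for this example, and it simply returns `None` on systems where /proc is not available.

```python
import os


def read_proc_field(name: str, pid: str = "self"):
    """Read /proc/<pid>/<name> as text, or return None where procfs is unavailable."""
    path = os.path.join("/proc", pid, name)
    try:
        with open(path, "rb") as handle:
            # Entries such as cmdline separate their arguments with NUL bytes.
            return handle.read().replace(b"\0", b" ").decode(errors="replace").strip()
    except OSError:
        return None


if __name__ == "__main__":
    print(read_proc_field("comm"))     # name of the current process
    print(read_proc_field("cmdline"))  # command line used to start it
```

No special library is involved here: the kernel generates the content at the moment the "file" is opened and read.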
In a computer, a file system — sometimes written filesystem — is the way in which files are named and where they are placed logically for storage and retrieval. Without a file system, stored information wouldn’t be isolated into individual files and would be difficult to identify and retrieve. As data capacities increase, the organization and accessibility of individual files are becoming even more important in data storage.
Digital file systems and files are named for and modeled after paper-based filing systems using the same logic-based method of storing and retrieving documents.
File systems can differ between operating systems (OS), such as Microsoft Windows, macOS and Linux-based systems. Some file systems are designed for specific applications. Major types of file systems include distributed file systems, disk-based file systems and special purpose file systems.
How file systems work
A file system stores and organizes data and can be thought of as a type of index for all the data contained in a storage device. These devices can include hard drives, optical drives and flash drives.
File systems specify conventions for naming files, including the maximum number of characters in a name, which characters can be used and, in some systems, how long the file name suffix can be. In many file systems, file names are not case sensitive.
Along with the file itself, file systems contain information such as the size of the file, as well as its attributes, location and hierarchy in the directory in the metadata. Metadata can also identify free blocks of available storage on the drive and how much space is available.
A file system also includes a format to specify the path to a file through the structure of directories. A file is placed in a directory — or a folder in Windows OS — or subdirectory at the desired place in the tree structure. PC and mobile OSes have file systems in which files are placed somewhere in a hierarchical tree structure.
Before files and directories are created on the storage medium, partitions should be put into place. A partition is a region of the hard disk or other storage that the OS manages separately. One file system is contained in the primary partition, and some OSes allow for multiple partitions on one disk. In this situation, if one file system gets corrupted, the data in a different partition will be safe.
File systems and the role of metadata
File systems use metadata to store and retrieve files. Examples of metadata tags include:
- Date created
- Date modified
- Last date of access
- Last backup
- User ID of the file creator
- Access permissions
- File size
Metadata is stored separately from the contents of the file, with many file systems storing the file names in separate directory entries. Some metadata may be kept in the directory, whereas other metadata may be kept in a structure called an inode.
In Unix-like operating systems, an inode can store metadata unrelated to the content of the file itself. The inode indexes information by number, which can be used to access the location of the file and then the file itself.
An example of a file system that capitalizes on metadata is the one used by OS X, Apple’s desktop OS. It allows for a number of optimization features, including file names that can stretch to 255 characters.
File system access
File systems can also restrict read and write access to a particular group of users. Passwords are the easiest way to do this. Along with controlling who can modify or read files, restricting access can ensure that data modification is controlled and limited.
File permissions such as access or capability control lists can also be used to moderate file system access. These types of mechanisms are useful to prevent access by regular users, but not as effective against outside intruders.
Encrypting files can also prevent user access, but it is focused more on protecting systems from outside attacks. An encryption key can be applied to unencrypted text to encrypt it, or the key can be used to decrypt encrypted text. Only users with the key can access the file. With encryption, the file system does not need to know the encryption key to manage the data effectively.
Types of file systems
There are a number of types of file systems, all with different logical structures and properties, such as speed and size. The type of file system can differ by OS and the needs of that OS. The three most common PC operating systems are Microsoft Windows, Mac OS X and Linux. Mobile OSes include Apple iOS and Google Android.
Major file systems include the following:
File allocation table (FAT) is supported by the Microsoft Windows OS. FAT is considered simple and reliable, and it is modeled after legacy file systems. FAT was designed in 1977 for floppy disks, but was later adapted for hard disks. While efficient and compatible with most current OSes, FAT cannot match the performance and scalability of more modern file systems.
Global file system (GFS) is a file system for the Linux OS, and it is a shared disk file system. GFS offers direct access to shared block storage and can be used as a local file system.
GFS2 is an updated version with features not included in the original GFS, such as an updated metadata system. Under the terms of the GNU General Public License, both the GFS and GFS2 file systems are available as free software.
Hierarchical file system (HFS) was developed for use with Mac operating systems. HFS can also be referred to as Mac OS Standard, and it was succeeded by Mac OS Extended. Originally introduced in 1985 for floppy and hard disks, HFS replaced the original Macintosh file system. It can also be used on CD-ROMs.
The NT file system — also known as the New Technology File System (NTFS) — is the default file system for Windows products from Windows NT 3.1 OS onward. Improvements from the previous FAT file system include better metadata support, performance and use of disk space. NTFS is also supported in the Linux OS through a free, open-source NTFS driver. Mac OSes have read-only support for NTFS.
Universal Disk Format (UDF) is a vendor-neutral file system used on optical media and DVDs. UDF replaces the ISO 9660 file system and is the official file system for DVD video and audio as chosen by the DVD Forum.
File system vs. DBMS
Like a file system, a database management system (DBMS) efficiently stores data that can be updated and retrieved. The two are not interchangeable, however. While a file system stores unstructured, often unrelated files, a DBMS is used to store and manage structured, related data.
A DBMS creates and defines the constraints for a database. A file system allows access to single files at a time and addresses each file individually. Because of this, functions such as redundancy are performed on an individual level, not by the file system itself. This makes a file system a much less consistent form of data storage than a DBMS, which maintains one repository of data that is defined once.
The centralized structure of a DBMS allows for easier file sharing than a file system and prevents anomalies that can occur when separate changes are made to files in a file system.
There are methods to protect files in a file system, but for heavy-duty security, a DBMS is the way to go. Security in a file system is determined by the OS, and it can be difficult to maintain over time as files are accessed and authorization is granted to users.
A DBMS keeps security constraints high, relying on password protection, encryption and limited authorization. More security does result in more obstacles when retrieving data, so in terms of general, simple-to-use file storage and retrieval, a file system may be preferred.
File systems definition evolves
While previously referring to physical, paper files, the term file system was used to refer to digital files as early as 1961. By 1964, it had entered general use to refer to computerized file systems.
The term file system can also refer to the part of an OS or an add-on program that supports a file system. Examples of such add-on file systems include the Network File System (NFS) and the Andrew File System (AFS).
In addition, the term has evolved to refer to the hardware used for nonvolatile storage, the software application that controls the hardware and architecture of both hardware and software.
A file system (often also written as filesystem) is a method of storing and organizing computer files and their data. Essentially, it organizes these files into a database for the storage, organization, manipulation, and retrieval by the computer’s operating system.
File systems are used on data storage devices such as hard disks or CD-ROMs to maintain the physical location of the files. Beyond this, they might provide access to data on a file server by acting as clients for a network protocol (e.g., NFS, SMB, or 9P clients), or they may be virtual and exist only as an access method for virtual data (e.g., procfs). A file system is distinguished from a directory service and registry.
Aspects of file systems
Most file systems make use of an underlying data storage device that offers access to an array of fixed-size physical sectors, generally a power of 2 in size (512 bytes or 1, 2, or 4 KiB are most common). The file system is responsible for organizing these sectors into files and directories, and keeping track of which sectors belong to which file and which are not being used. Most file systems address data in fixed-sized units called «clusters» or «blocks» which contain a certain number of disk sectors (usually 1-64). This is the smallest amount of disk space that can be allocated to hold a file. However, file systems need not make use of a storage device at all. A file system can be used to organize and represent access to any data, whether it is stored or dynamically generated (e.g., procfs).
File names
A file name is a name assigned to a file so that its storage location can be found again; the file is then accessed by this name. Whether the file system has an underlying storage device or not, file systems typically have directories which associate file names with files, usually by connecting the file name to an index in a file allocation table of some sort, such as the FAT in a DOS file system, or an inode in a Unix-like file system. Directory structures may be flat, or allow hierarchies where directories may contain subdirectories. In some file systems, file names are structured, with special syntax for filename extensions and version numbers. In others, file names are simple strings, and per-file metadata is stored elsewhere.
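The association between names and files can be pictured with a small in-memory model. The sketch below is a toy, not the on-disk layout of any real file system: the class ToyFileSystem and its methods are hypothetical names, the directory is flat, and an "inode" is reduced to a number plus the file's data.

```python
from dataclasses import dataclass


@dataclass
class Inode:
    number: int
    data: bytes = b""


class ToyFileSystem:
    """Flat directory that maps file names to inode numbers (illustrative only)."""

    def __init__(self) -> None:
        self.inodes = {}      # inode number -> Inode
        self.directory = {}   # file name -> inode number
        self._next_number = 1

    def create(self, name: str, data: bytes = b"") -> int:
        if name in self.directory:
            raise FileExistsError(name)
        inode = Inode(self._next_number, data)
        self._next_number += 1
        self.inodes[inode.number] = inode
        self.directory[name] = inode.number
        return inode.number

    def link(self, existing_name: str, new_name: str) -> None:
        # A second name for the same inode, similar in spirit to a Unix hard link.
        self.directory[new_name] = self.directory[existing_name]

    def read(self, name: str) -> bytes:
        return self.inodes[self.directory[name]].data


if __name__ == "__main__":
    fs = ToyFileSystem()
    fs.create("report.txt", b"draft")
    fs.link("report.txt", "copy-of-report.txt")
    print(fs.read("copy-of-report.txt"))  # b'draft'
```

Because the directory stores only an index, two names can refer to the same file, and per-file metadata can live with the inode rather than with the name.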
Metadata
Other bookkeeping information is typically associated with each file within a file system. The length of the data contained in a file may be stored as the number of blocks allocated for the file or as an exact byte count. The time that the file was last modified may be stored as the file’s timestamp. Some file systems also store the file creation time, the time it was last accessed, and the time that the file’s metadata was changed. (Note that many early PC operating systems did not keep track of file times.) Other information can include the file’s device type (e.g., block, character, socket, or subdirectory), its owner user-ID and group-ID, and its access permission settings (e.g., whether the file is read-only or executable).
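Most of this bookkeeping information can be inspected from a program. The following Python sketch uses the standard os.stat call to read a file's size, modification time, type, owner and permissions; the helper name describe and the dictionary keys are choices made for this example. Which fields are meaningful depends on the operating system and file system in use.

```python
import os
import stat
import tempfile
from datetime import datetime, timezone


def describe(path: str) -> dict:
    """Collect common per-file metadata as reported by os.stat."""
    st = os.stat(path)
    return {
        "size_bytes": st.st_size,
        "modified_utc": datetime.fromtimestamp(st.st_mtime, tz=timezone.utc).isoformat(),
        "permissions": stat.filemode(st.st_mode),   # e.g. '-rw-r--r--'
        "is_directory": stat.S_ISDIR(st.st_mode),
        "owner_uid": st.st_uid,
        "group_gid": st.st_gid,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "note.txt")
        with open(path, "wb") as handle:
            handle.write(b"hello")
        for key, value in describe(path).items():
            print(f"{key}: {value}")
```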
Arbitrary attributes can be associated on advanced file systems, such as NTFS, XFS, ext2/ext3, some versions of UFS, and HFS+, using extended file attributes. This feature is implemented in the kernels of Linux, FreeBSD and Mac OS X operating systems, and allows metadata to be associated with the file at the file system level. This, for example, could be the author of a document, the character encoding of a plain-text document, or a checksum.
Hierarchical file systems
The hierarchical file system (not to be confused with Apple’s HFS) was an early research interest of Dennis Ritchie of Unix fame; previous implementations were restricted to only a few levels, notably the IBM implementations, even of their early databases like IMS. After the success of Unix, Ritchie extended the file system concept to every object in his later operating system developments, such as Plan 9 and Inferno.
Facilities
Traditional file systems offer facilities to create, move and delete both files and directories. They lack facilities to create additional links to a directory (hard links in Unix), rename parent links («..» in Unix-like OS), and create bidirectional links to files.
Traditional file systems also offer facilities to truncate, append to, create, move, delete and in-place modify files. They do not offer facilities to prepend to or truncate from the beginning of a file, let alone arbitrary insertion into or deletion from a file. The operations provided are highly asymmetric and lack the generality to be useful in unexpected contexts. For example, interprocess pipes in Unix have to be implemented outside of the file system because the pipes concept does not offer truncation from the beginning of files.
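The asymmetry can be seen from an application's point of view. In the Python sketch below, appending and truncating from the end map directly onto operations the file system provides, while prepending has to be emulated by reading and rewriting the whole file. The helper names are hypothetical, and the prepend helper is deliberately naive (it holds the entire file in memory).

```python
import os
import tempfile


def append(path: str, data: bytes) -> None:
    with open(path, "ab") as handle:      # supported directly: write at the end
        handle.write(data)


def truncate_end(path: str, new_length: int) -> None:
    with open(path, "r+b") as handle:     # supported directly: cut off the tail
        handle.truncate(new_length)


def prepend(path: str, data: bytes) -> None:
    # No direct operation exists, so the whole file is read and rewritten.
    with open(path, "rb") as handle:
        old = handle.read()
    with open(path, "wb") as handle:
        handle.write(data + old)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "log.txt")
        with open(path, "wb") as handle:
            handle.write(b"world")
        append(path, b"!")
        prepend(path, b"hello ")
        with open(path, "rb") as handle:
            print(handle.read())          # b'hello world!'
```

The cost of prepend grows with the size of the file, whereas append and truncate_end do not need to touch the existing data.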
Secure access
See also: Secure computing
Secure access to basic file system operations can be based on a scheme of access control lists or capabilities. Research has shown access control lists to be difficult to secure properly, which is why research operating systems tend to use capabilities. Commercial file systems still use access control lists.
Types of file systems
File system types can be classified into disk file systems, network file systems and special purpose file systems.
Disk file systems
A disk file system is a file system designed for the storage of files on a data storage device, most commonly a disk drive, which might be directly or indirectly connected to the computer. Examples of disk file systems include FAT (FAT12, FAT16, FAT32, exFAT), NTFS, HFS and HFS+, HPFS, UFS, ext2, ext3, ext4, btrfs, ISO 9660, ODS-5, Veritas File System, ZFS, ReiserFS, Linux SWAP and UDF.
Some disk file systems are journaling file systems or versioning file systems.
ISO 9660 and Universal Disk Format are the two most common formats that target Compact Discs and DVDs. Mount Rainier is a newer extension to UDF supported by Linux 2.6 series and Windows Vista that facilitates rewriting to DVDs in the same fashion as has been possible with floppy disks.
Flash file systems
Main article: Flash file system
A flash file system is a file system designed for storing files on flash memory devices. These are becoming more prevalent as the number of mobile devices increases and the capacity of flash memories grows.
While a disk file system can be used on a flash device, this is suboptimal for several reasons:
- Erasing blocks: Flash memory blocks have to be explicitly erased before they can be rewritten. The time taken to erase blocks can be significant, thus it is beneficial to erase unused blocks while the device is idle.
- Random access: Disk file systems are optimized to avoid disk seeks whenever possible, due to the high cost of seeking. Flash memory devices impose no seek latency.
- Wear levelling: Flash memory devices tend to wear out when a single block is repeatedly overwritten; flash file systems are designed to spread out writes evenly.
Log-structured file systems have many of the desirable properties for a flash file system. Such file systems include JFFS2 and YAFFS.
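The wear-levelling idea from the list above can be sketched with a very small simulation. The class below is a hypothetical model, not the algorithm used by JFFS2, YAFFS or any real flash translation layer: it simply directs each write to the block that has been erased the fewest times.

```python
class WearLevelledFlash:
    """Toy model: every write goes to the least-worn block."""

    def __init__(self, num_blocks: int) -> None:
        if num_blocks <= 0:
            raise ValueError("num_blocks must be positive")
        self.erase_counts = [0] * num_blocks

    def write(self) -> int:
        # Pick the block with the lowest erase count (lowest index wins ties).
        block = min(range(len(self.erase_counts)), key=self.erase_counts.__getitem__)
        self.erase_counts[block] += 1     # erase-before-write wears the block
        return block


if __name__ == "__main__":
    flash = WearLevelledFlash(num_blocks=4)
    for _ in range(10):
        flash.write()
    print(flash.erase_counts)  # [3, 3, 2, 2]
```

A file system that always rewrote the same block would instead concentrate all ten erasures on one block, which is the behaviour flash file systems try to avoid.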
Tape file systems
A tape file system is a file system and tape format designed to store files on tape in a self-describing form. Magnetic tapes are sequential storage media, posing challenges to the creation and efficient management of a general-purpose file system. IBM has recently announced a new file system for tape called the Linear Tape File System. The IBM implementation of this file system has been released as the open-source IBM Long Term File System product.
Database file systems
A new concept for file management is the concept of a database-based file system. Instead of, or in addition to, hierarchical structured management, files are identified by their characteristics, like type of file, topic, author, or similar metadata.
Transactional file systems
Some programs need to update multiple files «all at once.» For example, a software installation may write program binaries, libraries, and configuration files. If the software installation fails, the program may be unusable. If the installation is upgrading a key system utility, such as the command shell, the entire system may be left in an unusable state.
Transaction processing introduces the isolation guarantee, which states that operations within a transaction are hidden from other threads on the system until the transaction commits, and that interfering operations on the system will be properly serialized with the transaction.
Transactions also provide the atomicity guarantee: operations inside a transaction are either all committed, or the transaction can be aborted and the system discards all of its partial results. This means that if there is a crash or power failure, after recovery, the stored state will be consistent. Either the software will be completely installed or the failed installation will be completely rolled back, but an unusable partial install will not be left on the system.
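Without a transactional file system, applications commonly approximate atomicity for a single file by writing a complete new copy and then renaming it over the old one. The Python sketch below shows that pattern; it is not a file system transaction and does not cover updates that span several files. The helper name atomic_write is hypothetical, and the rename is only atomic when the temporary file and the target are on the same file system.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace the contents of path so readers see the old or the new version."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so the rename stays on one file system.
    fd, temp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())     # push the new contents to stable storage
        os.replace(temp_path, path)       # switch names in a single step
    except BaseException:
        # On any failure, discard the partial result and leave the old file untouched.
        try:
            os.unlink(temp_path)
        except FileNotFoundError:
            pass
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        target = os.path.join(folder, "settings.conf")
        atomic_write(target, b"version=1\n")
        atomic_write(target, b"version=2\n")
        with open(target, "rb") as handle:
            print(handle.read())          # b'version=2\n'
```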
Windows, beginning with Vista, added transaction support to NTFS, abbreviated TxF. TxF is the only commercial implementation of a transactional file system, as transactional file systems are difficult to implement correctly in practice. There are a number of research prototypes of transactional file systems for UNIX systems, including the Valor file system[1], Amino[2], LFS[3], and a transactional ext3 file system on the TxOS kernel[4], as well as transactional file systems targeting embedded systems, such as TFFS[5].
Ensuring consistency across multiple file system operations is difficult, if not impossible, without file system transactions. File locking can be used as a concurrency control mechanism for individual files, but it typically does not protect the directory structure or file metadata. For instance, file locking cannot prevent TOCTTOU race conditions on symbolic links.
File locking also cannot automatically roll back a failed operation, such as a software upgrade; this requires atomicity.
Journaling file systems are one technique used to introduce transaction-level consistency to file system structures. Journal transactions are not exposed to programs as part of the OS API; they are only used internally to ensure consistency at the granularity of a single system call.
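The principle behind journaling can be illustrated with a toy redo log. The model below is a simplification with hypothetical names: real journaling file systems log metadata (and sometimes data) blocks on disk and have far more elaborate commit and checkpoint rules. Here an update is first recorded in the journal, then applied, then dropped from the journal; after a simulated crash, recovery replays whatever is still recorded.

```python
class JournaledStore:
    """Toy block store with a redo journal (illustrative only)."""

    def __init__(self) -> None:
        self.blocks = {}    # block number -> bytes, standing in for the disk
        self.journal = []   # committed updates not yet confirmed as applied

    def write(self, updates: dict, crash_before_apply: bool = False) -> None:
        self.journal.append(dict(updates))   # 1. log the complete update first
        if crash_before_apply:
            return                           # simulated crash: nothing applied yet
        self.blocks.update(updates)          # 2. apply the update in place
        self.journal.pop()                   # 3. checkpoint: the record is no longer needed

    def recover(self) -> None:
        # Replaying is safe even if an update was partly applied (writes are idempotent).
        for updates in self.journal:
            self.blocks.update(updates)
        self.journal.clear()


if __name__ == "__main__":
    store = JournaledStore()
    store.write({1: b"inode", 2: b"bitmap"}, crash_before_apply=True)
    store.recover()
    print(store.blocks)  # {1: b'inode', 2: b'bitmap'}
```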
Network file systems
Main article: Network file system
A network file system is a file system that acts as a client for a remote file access protocol, providing access to files on a server. Examples of network file systems include clients for the NFS, AFS, SMB protocols, and file-system-like clients for FTP and WebDAV.
Shared disk file systems
Main article: Shared disk file system
A shared disk file system is one in which a number of machines (usually servers) all have access to the same external disk subsystem (usually a SAN). The file system arbitrates access to that subsystem, preventing write collisions. Examples include GFS from Red Hat, GPFS from IBM, and SFS from DataPlow.
Special purpose file systems
Main article: Special file system
A special purpose file system is basically any file system that is not a disk file system or network file system. This includes systems where the files are arranged dynamically by software, intended for such purposes as communication between computer processes or temporary file space.
Special purpose file systems are most commonly used by file-centric operating systems such as Unix. Examples include the procfs (/proc) file system used by some Unix variants, which grants access to information about processes and other operating system features.
Deep-space science exploration craft, such as Voyager I and II, used digital tape-based special file systems. Most modern space exploration craft, such as Cassini-Huygens, have used real-time operating system (RTOS) file systems or RTOS-influenced file systems. The Mars Rovers are one example; their RTOS file systems are notable because they are implemented in flash memory.
File systems and operating systems
Most operating systems provide a file system, as a file system is an integral part of any modern operating system. Early microcomputer operating systems’ only real task was file management — a fact reflected in their names (see DOS). Some early operating systems had a separate component for handling file systems which was called a disk operating system. On some microcomputers, the disk operating system was loaded separately from the rest of the operating system. On early operating systems, there was usually support for only one, native, unnamed file system; for example, CP/M supports only its own file system, which might be called «CP/M file system» if needed, but which didn’t bear any official name at all.
Because of this, there needs to be an interface provided by the operating system software between the user and the file system. This interface can be textual (such as provided by a command line interface, such as the Unix shell, or OpenVMS DCL) or graphical (such as provided by a graphical user interface, such as file browsers). If graphical, the metaphor of the folder, containing documents, other files, and nested folders is often used (see also: directory and folder).
Flat file systems
In a flat file system, there are no subdirectories—everything is stored at the same (root) level on the media, be it a hard disk, floppy disk, etc. While simple, this system rapidly becomes inefficient as the number of files grows, and makes it difficult for users to organize data into related groups.
Like many small systems before it, the original Apple Macintosh featured a flat file system, called Macintosh File System (MFS). Its version of Mac OS was unusual in that the file management software (Macintosh Finder) created the illusion of a partially hierarchical filing system on top of MFS. This structure meant that every file on a disk had to have a unique name, even if it appeared to be in a separate folder. MFS was quickly replaced with Hierarchical File System, which supported real directories.
A recent addition to the flat file system family is Amazon’s S3, a remote storage service, which is intentionally simplistic to allow users the ability to customize how their data is stored. The only constructs are buckets (imagine a disk drive of unlimited size) and objects (similar, but not identical to the standard concept of a file). Advanced file management is allowed by being able to use nearly any character (including ‘/’) in the object’s name, and the ability to select subsets of the bucket’s content based on identical prefixes.
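Prefix-based selection in a flat namespace can be sketched without any storage service at all. The Python example below models a bucket as a plain dictionary and does not use Amazon's API; the object names are invented for illustration.

```python
def list_by_prefix(bucket: dict, prefix: str) -> list:
    """Return the object names in a flat bucket that start with prefix."""
    return sorted(name for name in bucket if name.startswith(prefix))


if __name__ == "__main__":
    # One flat namespace; '/' is just another character in the object name.
    bucket = {
        "photos/2009/beach.jpg": b"...",
        "photos/2010/city.jpg": b"...",
        "notes/todo.txt": b"...",
    }
    print(list_by_prefix(bucket, "photos/"))
    # ['photos/2009/beach.jpg', 'photos/2010/city.jpg']
```

The result looks like a directory listing, although no directory exists: the hierarchy is only a naming convention applied to a flat set of names.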
File systems under Unix-like operating systems
Unix-like operating systems create a virtual file system, which makes all the files on all the devices appear to exist in a single hierarchy. This means, in those systems, there is one root directory, and every file existing on the system is located under it somewhere. Unix-like systems can use a RAM disk or network shared resource as their root directory.
Unix-like systems assign a device name to each device, but this is not how the files on that device are accessed. Instead, to gain access to files on another device, the operating system must first be informed where in the directory tree those files should appear. This process is called mounting a file system. For example, to access the files on a CD-ROM, one must tell the operating system «Take the file system from this CD-ROM and make it appear under such-and-such directory». The directory given to the operating system is called the mount point – it might, for example, be /media. The /media directory exists on many Unix systems (as specified in the Filesystem Hierarchy Standard) and is intended specifically for use as a mount point for removable media such as CDs, DVDs, USB drives or floppy disks. It may be empty, or it may contain subdirectories for mounting individual devices. Generally, only the administrator (i.e. root user) may authorize the mounting of file systems.
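How a single directory tree is stitched together from several devices can be sketched as a lookup in a mount table. The Python model below is a simplification with hypothetical device names; a real kernel resolves paths component by component and also handles symbolic links and permissions. Here the longest mount point that is a prefix of the path decides which device holds the file.

```python
import posixpath


def resolve(mounts: dict, path: str) -> tuple:
    """Return (device, path relative to that device's root) for an absolute path."""
    path = posixpath.normpath(path)
    best = None
    for mount_point in mounts:
        # Match whole path components only, so /media/cdromX is not under /media/cdrom.
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) > len(best):
                best = mount_point
    if best is None:
        raise LookupError(f"no file system mounted for {path}")
    return mounts[best], posixpath.relpath(path, best)


if __name__ == "__main__":
    # Hypothetical mount table: mount point -> device.
    mounts = {"/": "disk0", "/media/cdrom": "cdrom0"}
    print(resolve(mounts, "/media/cdrom/readme.txt"))  # ('cdrom0', 'readme.txt')
    print(resolve(mounts, "/etc/fstab"))               # ('disk0', 'etc/fstab')
```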
Unix-like operating systems often include software and tools that assist in the mounting process and provide it new functionality. Some of these strategies have been coined «auto-mounting» as a reflection of their purpose.
- In many situations, file systems other than the root need to be available as soon as the operating system has booted. All Unix-like systems therefore provide a facility for mounting file systems at boot time. System administrators define these file systems in the configuration file fstab (vfstab in the Solaris Operating Environment), which also indicates options and mount points.
- In some situations, there is no need to mount certain file systems at boot time, although their use may be desired thereafter. There are some utilities for Unix-like systems that allow the mounting of predefined file systems upon demand.
- Removable media have become very common with microcomputer platforms. They allow programs and data to be transferred between machines without a physical connection. Common examples include USB flash drives, CD-ROMs, and DVDs. Utilities have therefore been developed to detect the presence and availability of a medium and then mount that medium without any user intervention.
- Progressive Unix-like systems have also introduced a concept called supermounting; see, for example, the Linux supermount-ng project. For example, a floppy disk that has been supermounted can be physically removed from the system. Under normal circumstances, the disk should have been synchronized and then unmounted before its removal. Provided synchronization has occurred, a different disk can be inserted into the drive. The system automatically notices that the disk has changed and updates the mount point contents to reflect the new medium. Similar functionality is found on Windows machines.
- A similar innovation preferred by some users is the use of autofs, a system that, like supermounting, eliminates the need for manual mounting commands. The difference from supermount, other than compatibility in an apparent greater range of applications such as access to file systems on network servers, is that devices are mounted transparently when requests to their file systems are made, as would be appropriate for file systems on network servers, rather than relying on events such as the insertion of media, as would be appropriate for removable media.
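The boot-time table mentioned in the first item of the list above can be read with a few lines of code. The Python sketch below assumes the six whitespace-separated fields of the Linux fstab format (device, mount point, type, options, dump, pass, with the last two optional); Solaris vfstab uses a different column layout and is not handled. The sample entries and helper names are invented for illustration.

```python
from typing import List, NamedTuple


class FstabEntry(NamedTuple):
    device: str
    mount_point: str
    fs_type: str
    options: List[str]
    dump: int
    fsck_pass: int


def parse_fstab(text: str) -> List[FstabEntry]:
    """Parse fstab-style text, skipping blank lines and '#' comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            raise ValueError(f"expected at least 4 fields: {line!r}")
        dump = int(fields[4]) if len(fields) > 4 else 0        # optional, defaults to 0
        fsck_pass = int(fields[5]) if len(fields) > 5 else 0   # optional, defaults to 0
        entries.append(FstabEntry(fields[0], fields[1], fields[2],
                                  fields[3].split(","), dump, fsck_pass))
    return entries


SAMPLE = """\
# device        mount point  type  options           dump  pass
/dev/sda1       /            ext3  defaults          0     1
/dev/sda2       /home        ext3  defaults,noatime  0     2
server:/export  /mnt/nfs     nfs   ro
"""

if __name__ == "__main__":
    for entry in parse_fstab(SAMPLE):
        print(entry.mount_point, entry.fs_type, entry.options)
```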
File systems under Linux
Linux supports many different file systems, but common choices for the system disk include the ext* family (such as ext2, ext3 and ext4), XFS, JFS, ReiserFS and btrfs.
File systems under Solaris
The Sun Microsystems Solaris operating system in earlier releases defaulted to (non-journaled or non-logging) UFS for bootable and supplementary file systems. Solaris defaulted to, supported, and extended UFS.
Support for other file systems and significant enhancements were added over time, including Veritas Software Corp. (journaling) VxFS, Sun Microsystems (clustering) QFS, Sun Microsystems (journaling) UFS, and Sun Microsystems (open source, poolable, 128-bit, compressible, and error-correcting) ZFS.
Kernel extensions were added to Solaris to allow for bootable Veritas VxFS operation. Logging or Journaling was added to UFS in Sun’s Solaris 7. Releases of Solaris 10, Solaris Express, OpenSolaris, and other open source variants of the Solaris operating system later supported bootable ZFS.
Logical Volume Management allows for spanning a file system across multiple devices for the purpose of adding redundancy, capacity, and/or throughput. Legacy environments in Solaris may use Solaris Volume Manager (formerly known as Solstice DiskSuite). Multiple operating systems (including Solaris) may use Veritas Volume Manager. Modern Solaris-based operating systems remove the need for separate volume management by using virtual storage pools in ZFS.
File systems under Mac OS X
Mac OS X uses a file system that it inherited from classic Mac OS called HFS Plus, sometimes called Mac OS Extended. HFS Plus is a metadata-rich and case preserving file system. Due to the Unix roots of Mac OS X, Unix permissions were added to HFS Plus. Later versions of HFS Plus added journaling to prevent corruption of the file system structure and introduced a number of optimizations to the allocation algorithms in an attempt to defragment files automatically without requiring an external defragmenter.
Filenames can be up to 255 characters. HFS Plus uses Unicode to store filenames. On Mac OS X, the filetype can come from the type code, stored in file’s metadata, or the filename.
HFS Plus has three kinds of links: Unix-style hard links, Unix-style symbolic links and aliases. Aliases are designed to maintain a link to their original file even if they are moved or renamed; they are not interpreted by the file system itself, but by the File Manager code in userland.
Mac OS X also supports the UFS file system, derived from the BSD Unix Fast File System via NeXTSTEP. However, as of Mac OS X 10.5 (Leopard), Mac OS X can no longer be installed on a UFS volume, nor can a pre-Leopard system installed on a UFS volume be upgraded to Leopard.[6]
File systems under Plan 9 from Bell Labs
Plan 9 from Bell Labs was originally designed to extend some of Unix’s good points, and to introduce some new ideas of its own while fixing the shortcomings of Unix.
With respect to file systems, the Unix system of treating things as files was continued, but in Plan 9, everything is treated as a file, and accessed as a file would be (i.e., no ioctl or mmap). Perhaps surprisingly, while the file interface is made universal it is also simplified considerably: symlinks, hard links and suid are made obsolete, and an atomic create/open operation is introduced. More importantly the set of file operations becomes well defined and subversions of this like ioctl are eliminated.
Secondly, the underlying 9P protocol was used to remove the difference between local and remote files (except for a possible difference in latency or in throughput). This has the advantage that a device or devices, represented by files, on a remote computer could be used as though it were the local computer’s own device(s). This means that under Plan 9, multiple file servers provide access to devices, classing them as file systems. Servers for «synthetic» file systems can also run in user space bringing many of the advantages of micro kernel systems while maintaining the simplicity of the system.
Everything on a Plan 9 system has an abstraction as a file; networking, graphics, debugging, authentication, capabilities, encryption, and other services are accessed via I/O operations on file descriptors. For example, this allows the use of the IP stack of a gateway machine without need of NAT, or provides a network-transparent window system without the need of any extra code.
Another example: a Plan-9 application receives FTP service by opening an FTP site. The ftpfs server handles the open by essentially mounting the remote FTP site as part of the local file system. With ftpfs as an intermediary, the application can now use the usual file-system operations to access the FTP site as if it were part of the local file system. A further example is the mail system which uses file servers that synthesize virtual files and directories to represent a user mailbox as /mail/fs/mbox. The wikifs provides a file system interface to a wiki.
These file systems are organized with the help of private, per-process namespaces, allowing each process to have a different view of the many file systems that provide resources in a distributed system.
The Inferno operating system shares these concepts with Plan 9.
File systems under Microsoft Windows
[Figure: Directory listing in a Windows command shell]
Windows makes use of the FAT and NTFS file systems.
FAT
The File Allocation Table (FAT) filing system, supported by all versions of Microsoft Windows, was an evolution of that used in Microsoft’s earlier operating system (MS-DOS which in turn was based on 86-DOS). FAT ultimately traces its roots back to the short-lived M-DOS project and Standalone disk BASIC before it. Over the years various features have been added to it, inspired by similar features found on file systems used by operating systems such as Unix.
Older versions of the FAT file system (FAT12 and FAT16) had file name length limits, a limit on the number of entries in the root directory of the file system and had restrictions on the maximum size of FAT-formatted disks or partitions. Specifically, FAT12 and FAT16 had a limit of 8 characters for the file name, and 3 characters for the extension (such as .exe). This is commonly referred to as the 8.3 filename limit. VFAT, which was an extension to FAT12 and FAT16 introduced in Windows NT 3.5 and subsequently included in Windows 95, allowed long file names (LFN).
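The 8.3 limit is easy to express as a check. The Python sketch below is a simplified test and an assumption-laden one: it accepts letters, digits and a commonly cited set of punctuation characters, and it ignores reserved device names and other details of real FAT implementations. The function name is hypothetical.

```python
import re

# Up to 8 characters for the name, optionally a dot and up to 3 for the extension.
_ALLOWED = r"[A-Z0-9_~!#$%&'()@^`{}-]"
_EIGHT_THREE = re.compile(rf"{_ALLOWED}{{1,8}}(\.{_ALLOWED}{{1,3}})?")


def is_8_3_name(name: str) -> bool:
    """Return True if name fits the 8.3 pattern (simplified check)."""
    # Short names are case-insensitive, so compare in upper case.
    return _EIGHT_THREE.fullmatch(name.upper()) is not None


if __name__ == "__main__":
    for candidate in ["AUTOEXEC.BAT", "readme.txt", "longfilename.txt", "index.html"]:
        print(candidate, is_8_3_name(candidate))
```

Names that fail this check are the ones for which VFAT long file names were introduced.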
FAT32 also addressed many of the limits in FAT12 and FAT16, but remains limited compared to NTFS.
exFAT (also known as FAT64) is the newest iteration of FAT, with certain advantages over NTFS with regard to file system overhead. exFAT is only compatible with newer Windows systems, such as Windows 2003, Windows Vista, Windows 2008 and Windows 7; more recently, support has been added for Windows XP[7].
NTFS
NTFS, introduced with the Windows NT operating system, allowed ACL-based permission control. Hard links, multiple file streams, attribute indexing, quota tracking, sparse files, encryption, compression, reparse points (directories working as mount-points for other file systems, symlinks, junctions, remote storage links) are also supported, though not all these features are well-documented.
Unlike many other operating systems, Windows uses a drive letter abstraction at the user level to distinguish one disk or partition from another. For example, the path C:\WINDOWS represents a directory WINDOWS on the partition represented by the letter C. The C drive is most commonly used for the primary hard disk partition, on which Windows is usually installed and from which it boots. This «tradition» has become so firmly ingrained that bugs came about in older applications which made assumptions that the drive that the operating system was installed on was C. The tradition of using «C» for the drive letter can be traced to MS-DOS, where the letters A and B were reserved for up to two floppy disk drives. This in turn derived from CP/M in the 1970s, which however used A: and B: for hard drives, and C: for floppy disks, and ultimately from IBM’s CP/CMS of 1967.
Network drives may also be mapped to drive letters.
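The drive letter is part of the path syntax and can be separated from the rest of the path mechanically. The Python sketch below uses pathlib.PureWindowsPath from the standard library, which parses Windows-style paths on any operating system without touching a disk; the helper name split_drive is hypothetical.

```python
from pathlib import PureWindowsPath


def split_drive(path: str) -> tuple:
    """Return (drive, list of remaining path components) for a Windows-style path."""
    parsed = PureWindowsPath(path)
    # When the path has an anchor (drive and/or root), it is the first element of parts.
    components = parsed.parts[1:] if parsed.anchor else parsed.parts
    return parsed.drive, list(components)


if __name__ == "__main__":
    print(split_drive(r"C:\WINDOWS"))            # ('C:', ['WINDOWS'])
    print(split_drive(r"D:\data\report.txt"))    # ('D:', ['data', 'report.txt'])
```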
File systems under OpenVMS
Main article: Files-11
File systems under MVS [IBM Mainframe]
Main article: MVS#MVS filesystem
Other file systems
- The Prospero File System is a file system based on the Virtual System Model. The system was created by Dr. B. Clifford Neuman of the Information Sciences Institute at the University of Southern California.[8]
- RSRE FLEX file system — written in ALGOL 68
- The file system of the Michigan Terminal System (MTS) is interesting because: (i) it provides «line files» where record lengths and line numbers are associated as metadata with each record in the file, lines can be added, replaced, updated with the same or different length records, and deleted anywhere in the file without the need to read and rewrite the entire file; (ii) using program keys files may be shared or permitted to commands and programs in addition to users and groups; and (iii) there is a comprehensive file locking mechanism that protects both the file’s data and its metadata.[9][10]
See also
- Comparison of file systems
- Directory structure
- Disk sharing
- Distributed file system
- Filename extension
- File manager
- File system fragmentation
- Filesystem API
- Physical and logical storage
- List of file systems
- List of Unix programs
- Virtual file system
- Storage efficiency
References
Cited references
- [1] Spillane, Richard; Gaikwad, Sachin; Chinni, Manjunath; Zadok, Erez; Wright, Charles P.; 2009; «Enabling transactional file access via lightweight kernel extensions»; Seventh USENIX Conference on File and Storage Technologies (FAST 2009)
- [2] Wright, Charles P.; Spillane, Richard; Sivathanu, Gopalan; Zadok, Erez; 2007; «Extending ACID Semantics to the File System»; ACM Transactions on Storage
- [3] Seltzer, Margo I.; 1993; «Transaction Support in a Log-Structured File System»; Proceedings of the Ninth International Conference on Data Engineering
- [4] Porter, Donald E.; Hofmann, Owen S.; Rossbach, Christopher J.; Benn, Alexander; Witchel, Emmett; 2009; «Operating System Transactions»; Proceedings of the 22nd ACM Symposium on Operating Systems Principles (SOSP ’09), Big Sky, MT, October 2009
- [5] Gal, Eran; Toledo, Sivan; «A Transactional Flash File System for Microcontrollers»
- [6] «Mac OS X 10.5 Leopard: Installing on a UFS-formatted volume»; http://docs.info.apple.com/article.html?artnum=306516
  Note: Newer versions of Mac OS X are capable of reading and writing to the legacy FAT file systems (FAT16 and FAT32). They are capable of reading, but not writing to, the NTFS file system. Third-party software is still necessary to write to the NTFS file system under Snow Leopard 10.6.2.
- [7] Microsoft WinXP exFAT patch; http://www.microsoft.com/downloads/details.aspx?FamilyID=1cbe3906-ddd1-4ca2-b727-c2dff5e30f61&displaylang=en
- [8] http://www.cs.ucsb.edu/~ravenben/papers/fsml/prospero-gfsvsm.ps.gz
- [9] «A file system for a general-purpose time-sharing environment», G. C. Pirkola, Proceedings of the IEEE, June 1975, volume 63 no. 6, pp. 918–924, ISSN 0018-9219
- [10] «The Protection of Information in a General Purpose Time-Sharing Environment», Gary C. Pirkola and John Sanguinetti, Proceedings of the IEEE Symposium on Trends and Applications 1977: Computer Security and Integrity, vol. 10 no. 4, pp. 106–114
General references
- Jonathan de Boyne Pollard (1996). «Disc and volume size limits». Frequently Given Answers. Retrieved February 9, 2005.
- IBM. «OS/2 corrective service fix JR09427». Retrieved February 9, 2005.
- «Attribute — $EA_INFORMATION (0xD0)». NTFS Information, Linux-NTFS Project. Retrieved February 9, 2005.
- «Attribute — $EA (0xE0)». NTFS Information, Linux-NTFS Project. Retrieved February 9, 2005.
- «Attribute — $STANDARD_INFORMATION (0x10)». NTFS Information, Linux-NTFS Project. Retrieved February 21, 2005.
- Apple Computer Inc. «Technical Note TN1150: HFS Plus Volume Format». Detailed HFS Plus and HFSX description. Retrieved May 2, 2006.
- File System Forensic Analysis, Brian Carrier, Addison Wesley, 2005.
Further reading
- Books
- Carrier, Brian (2005). File System Forensic Analysis. Addison-Wesley. ISBN 0321268172.
- Custer, Helen (1994). Inside the Windows NT File System. Microsoft Press. ISBN 155615660X.
- Giampaolo, Dominic (1999) (PDF). Practical File System Design with the Be File System. Morgan Kaufmann Publishers. ISBN 1558604979. Retrieved 2010-01-22.
- McCoy, Kirby (1990). VMS File System Internals. VAX — VMS Series. Digital Press. ISBN 1555580564.
- Mitchell, Stan (1997). Inside the Windows 95 File System. O’Reilly. ISBN 156592200X.
- Nagar, Rajeev (1997). Windows NT File System Internals : A Developer’s Guide. O’Reilly. ISBN 9781565922495.
- Pate, Steve D. (2003). UNIX Filesystems: Evolution, Design, and Implementation. Wiley. ISBN 0471164836.
- Rosenblum, Mendel (1994). The Design and Implementation of a Log-Structured File System. The Springer International Series in Engineering and Computer Science. Springer. ISBN 0792395417.
- Russinovich, Mark; Solomon, David A.; Ionescu, Alex (2009). «File Systems». Windows Internals (5th ed.). Microsoft Press. ISBN 0735625301.
- Prabhakaran, Vijayan (2006). IRON File Systems. PhD dissertation, University of Wisconsin-Madison.
- Silberschatz, Abraham; Galvin, Peter Baer; Gagne, Greg (2004). «Storage Management». Operating System Concepts (7th ed.). Wiley. ISBN 0471694665.
- Tanenbaum, Andrew S. (2007). «File Systems». Modern Operating Systems (3rd ed.). Prentice Hall. ISBN 0136006639.
- Tanenbaum, Andrew S.; Woodhull, Albert S. (2006). «File Systems». Operating Systems: Design and Implementation (3rd ed.). Prentice Hall. ISBN 0131429388.
- Online articles
- Benchmarking Filesystems (outdated) by Justin Piszcz, Linux Gazette 102, May 2004
- Benchmarking Filesystems Part II using kernel 2.6, by Justin Piszcz, Linux Gazette 122, January 2006
- Filesystems (ext3, ReiserFS, XFS, JFS) comparison on Debian Etch
- Interview With the People Behind JFS, ReiserFS & XFS
- Journal File System Performance (outdated): ReiserFS, JFS, and Ext3FS show their merits on a fast RAID appliance
- Journaled Filesystem Benchmarks (outdated): A comparison of ReiserFS, XFS, JFS, ext3 & ext2
- Large List of File System Summaries
- Linux File System Benchmarks v2.6 kernel with a stress on CPU usage
- Linux Filesystem Benchmarks
- Linux large file support (outdated)
- Local Filesystems for Windows
- Overview of some filesystems (outdated)
- Sparse files support (outdated)
- Jeremy Reimer (March 16, 2008). «From BFS to ZFS: past, present, and future of file systems». arstechnica.com. Retrieved 2008-03-18.
External links
- Filesystem Specifications — Links & Whitepapers
- Interesting File System Projects
Have you ever needed to format a new hard drive or USB drive, and been given the option of selecting from acronyms like FAT, FAT32, or NTFS? Or did you once try plugging in an external device, only for your operating system to have trouble understanding it? Here’s another one… do you sometimes simply get frustrated by how long it takes your OS to find a particular file while searching?
If you have experienced any of the above, or simply pointed and clicked your way to a file or application on your computer, then you’ve had first-hand experience of what a file system is.
Many people might not employ an explicit methodology for organizing their personal files on a PC (explainer_file_system_final_actualfinal_FinalDraft.docx). However, the abstract concept of organizing files and directories for any device with persistent memory needs to be very systematic when reading, writing, copying, deleting, and interfacing with data. This job of the operating system is typically assigned to the file system.
There are many different ways to organize files and directories. If you can simply imagine a physical file cabinet with papers and folders, you would need to consider many things when coming up with a system for retrieving your documents. Would you organize the folders in alphabetical, or reverse alphabetical order? Would you prioritize commonly accessed files in the front or back of the file cabinet? How would you deal with duplicates, whether on purpose (for redundancy) or accidental (naming two files exactly the same way)? These are just a few analogous questions that need answering when developing a file system.
In this explainer, we’ll take a deep dive into how modern day computers tackle these problems. We’ll go over the various roles of a file system in the larger context of an operating system and physical drives, in addition to how file systems are designed and implemented.
Persistent Data: Files and Directories
Modern operating systems are increasingly complex, and need to manage various hardware resources, schedule processes, and virtualize memory, among many other tasks. When it comes to data, many hardware advances such as caches and RAM have been designed to speed up access time, and ensure that frequently used data is «nearby» the processor. However, when you power down your computer, only the information stored on persistent devices, such as hard disk drives (HDDs) or solid-state drives (SSDs), will survive the power cycle. Thus, the OS must take extra care of these devices and the data onboard, since this is where users will keep data they really care about.
Two of the most important abstractions developed over time for storage are the file and the directory. A file is a linear array of bytes, each of which you can read or write. While in user space we can think of clever names for our files, under the hood each file also has a low-level numerical identifier. Historically, this low-level name is referred to as its inode number (more on that later). Interestingly, the OS itself does not know much about the internal structure of a file (i.e., whether it is a picture, video, or text file); all it needs to know is how to write the bytes into the file for persistent storage, and make sure it can retrieve them later when called upon.
The second main abstraction is the directory. A directory is actually just a file underneath the hood, but contains a very specific set of data: a list of user-readable names to low-level name mappings. Practically speaking, that means it contains a list of other directories or files, which altogether can form a directory tree, under which all files and directories are stored.
Such an organization is quite expressive and scalable. All you need is a pointer to the root of the directory tree (physically speaking, that would be to the first inode in the system), and from there you can access any other files on that disk partition. This system also allows you to create files with the same name, so long as they do not have the same path (i.e., they fall under different locations in the file-system tree).
Additionally, you can technically name a file anything you want! While it is conventional to denote the type of file with a period-separated suffix (such as .jpg in picture.jpg), that is purely optional. Some operating systems such as Windows rely heavily on these conventions to open files in the appropriate application, but the content of the file itself isn’t dependent on the file extension. The extension is just a hint for the OS on how to interpret the bytes contained inside a file.
Once you have files and directories, you need to be able to operate on them. In the context of a file system, that means being able to read the data, write data, manipulate files (delete, move, copy, etc.), and manage permissions for files (who can perform all the operations above?). How are modern file systems implemented to allow for all these operations to happen quickly and in a scalable fashion?
File System Organization
When thinking about a file system, there are typically two aspects that need to be addressed. The first is the data structures of the file system. In other words, what types of on-disk structures are used by the file system to organize its data and metadata? The second aspect is its access methods: how can a process open, read, or write onto its structures?
Let’s begin by describing the overall on-disk organization of a rudimentary file system.
The first thing you need to do is to divide your disk into blocks. A commonly used block size is 4 KB. Let’s assume you have a very small disk with 256 KB of storage space. The first step is to divide this space evenly using your block size, and identify each block with a number (in our case, labeling the blocks from 0 to 63):
Now, let’s break up these blocks into various regions. Let’s set aside most of the blocks for user data, and call this the data region. In this example, let’s fix blocks 8-63 as our data region:
As you may have noticed, we put the data region in the latter part of the disk, leaving the first few blocks for the file system to use for a different purpose. Specifically, we want to use them to track information about files, such as where a file might be in the data region, how large a file is, its owner and access rights, and other types of information. This information is a key piece of the file system, and is called metadata.
To store this metadata, we will use a special data structure called an inode. In the running example, let’s set aside 5 blocks (blocks 3-7) for inodes, and call this region of the disk the inode table:
Illustrations borrowed from book: Operating Systems: Three Easy Pieces
Inodes are typically not that big, for example 256 bytes. Thus, a 4 KB block can hold 16 inodes, and the five blocks of our inode table hold 80 inodes in total. This number is actually significant: it means that the maximum number of files in our file system is 80. With a larger disk, you can certainly increase the number of inodes, directly translating to more files in your file system.
There are a few things remaining to complete our file system. We also need a way to keep track of whether inodes or data blocks are free or allocated. This allocation structure can be implemented as two separate bitmaps, one for inodes and another for the data region.
A bitmap is a very simple data structure: each bit corresponds to whether an object/block is free (0) or in-use (1). We can assign the inode bitmap and data region bitmap to their own block. Although this is overkill (a 4 KB block holds 32,768 bits and can therefore track up to 32K objects, but we only have 80 inodes and 56 data blocks), this is a convenient and simple way to organize our file system.
Finally, for the last remaining block (which, coincidentally, is the first block in our disk), we need to have a superblock. This superblock is a sort of metadata for the metadata: in the block, we can store information about the file system, such as how many inodes there are (80) and where the inode table begins (block 3) and so forth. We can also put some identifier for the file system in the superblock to understand how to interpret nuances and details for different file system types (e.g., we can note that this file system is a Unix-based ext4 file system, or perhaps NTFS). When the operating system reads the superblock, it has a blueprint for how to interpret and access different data on the disk.
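To make the bitmap idea concrete, here is a minimal sketch in Python. The class name Bitmap and its methods are hypothetical names chosen for illustration, and the first-free allocation policy is an assumption; real file systems use more elaborate allocation strategies.

```python
class Bitmap:
    """Tracks free (0) and in-use (1) objects, one bit per object."""

    def __init__(self, count):
        self.count = count
        # Round up to a whole number of bytes.
        self.bits = bytearray((count + 7) // 8)

    def is_set(self, index):
        return bool(self.bits[index // 8] & (1 << (index % 8)))

    def allocate(self):
        """Mark the first free object as in-use and return its index."""
        for index in range(self.count):
            if not self.is_set(index):
                self.bits[index // 8] |= 1 << (index % 8)
                return index
        raise RuntimeError("no free objects left")

    def free(self, index):
        """Mark an object as free again; its contents are not touched."""
        self.bits[index // 8] &= ~(1 << (index % 8)) & 0xFF


# Sizes from the running example: 80 inodes and 56 data blocks.
inode_bitmap = Bitmap(80)
data_bitmap = Bitmap(56)

# One 4 KB block provides far more bits than either bitmap needs.
BITS_PER_BLOCK = 4096 * 8
```

Both bitmaps together occupy only 17 bytes here, which is why giving each one a full 4 KB block is overkill, if convenient.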
Adding a superblock (S), an inode bitmap (i), and a data region bitmap (d) to our simple system.
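The whole layout of this example disk can be written down in a few lines. The following Python sketch simply restates the numbers from the text; the names LAYOUT and region_of are hypothetical, and placing the inode bitmap before the data bitmap is an assumption based on the order in the caption.

```python
BLOCK_SIZE = 4 * 1024      # 4 KB blocks
DISK_SIZE = 256 * 1024     # 256 KB disk
INODE_SIZE = 256           # bytes per inode

# Block ranges for each region (the end of each range is exclusive).
LAYOUT = {
    "superblock": range(0, 1),
    "inode bitmap": range(1, 2),
    "data bitmap": range(2, 3),
    "inode table": range(3, 8),
    "data region": range(8, 64),
}

total_blocks = DISK_SIZE // BLOCK_SIZE
inodes_per_block = BLOCK_SIZE // INODE_SIZE
total_inodes = inodes_per_block * len(LAYOUT["inode table"])


def region_of(block):
    """Return the name of the region that a block number belongs to."""
    for name, blocks in LAYOUT.items():
        if block in blocks:
            return name
    raise ValueError(f"block {block} is outside the disk")


print(total_blocks, inodes_per_block, total_inodes)
```

The three printed values are the 64 blocks, 16 inodes per block, and 80 inodes used throughout the example.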
The Inode
So far, we’ve mentioned the inode data structure in a file system, but have not yet explained what this critical component is. An inode is short for an index node, and is a historical name given from UNIX and earlier file systems. Practically all modern day systems use the concept of an inode, but may call them different things (such as dnodes, fnodes, etc).
Fundamentally though, the inode is an indexable data structure, meaning the information in it is laid out in a fixed format, such that you can jump to a particular location (the index) and know how to interpret the next set of bits.
A particular inode is referred to by a number (the i-number), and this is the low-level name of the file. Given an i-number, you can look up its information by quickly jumping to its location. For example, from the superblock, we know that the inode region starts at the 12 KB address (block 3).
Since a disk is not byte-addressable, we have to know which block to access in order to find our inode. With some fairly simple math, we can compute the block ID based on the i-number of interest, the size of each inode, and the size of a block. Subsequently, we can find the start of the inode within the block, and read the desired information.
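The "fairly simple math" can be sketched as follows, using the sizes from the running example (4 KB blocks, 256-byte inodes, inode table starting at 12 KB). The function name locate_inode is hypothetical, and the sketch works in whole blocks rather than the smaller sectors a real disk exposes.

```python
BLOCK_SIZE = 4096
INODE_SIZE = 256
INODE_TABLE_START = 12 * 1024  # byte address of the inode table (block 3)


def locate_inode(inumber):
    """Return (block number, byte offset within that block) for an i-number."""
    byte_address = INODE_TABLE_START + inumber * INODE_SIZE
    block = byte_address // BLOCK_SIZE
    offset = byte_address % BLOCK_SIZE
    return block, offset


# Inode 17 is the second inode in the second block of the inode table.
print(locate_inode(17))  # (4, 256)
```

The file system reads the whole block returned by the first value, then interprets the 256 bytes starting at the returned offset as the inode.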
The inode contains virtually all of the information you need about a file. For example, is it a regular file or a directory? What is its size? How many blocks are allocated to it? What permissions are allowed to access the file (i.e., who is the owner, and who can read or write)? When was the file created or last accessed? And many other flags or metadata about the file.
One of the most important pieces of information kept in the inode is a pointer (or list of pointers) on where the data resides in the data region. These are known as direct pointers. The concept is nice, but for very large files, you might run out of pointers in the small inode data structure. Thus, many modern systems have special indirect pointers: instead of directly going to the data of the file in the data region, you can use an indirect block in the data region to expand the number of direct pointers for your file. In this way, files can become much larger than the limited set of direct pointers available in the inode data structure.
Unsurprisingly, you can use this approach to support even larger files, by having double or triple indirect pointers. This type of file system is known as having a multi-level index, and allows a file system to support large files (think gigabytes or larger). Common file systems such as ext2 and ext3 use multi-level indexing systems. Newer file systems, such as ext4, have the concept of extents, which describe a contiguous run of blocks with a starting pointer and a length instead of one pointer per block.
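A quick calculation shows how much each level of indirection buys. The sketch below assumes 12 direct pointers per inode and 4-byte block pointers; neither number comes from the text above, so treat both as illustrative assumptions alongside the 4 KB block size of the running example.

```python
BLOCK_SIZE = 4096       # bytes per block (from the running example)
POINTER_SIZE = 4        # bytes per block pointer (assumption)
DIRECT_POINTERS = 12    # direct pointers in the inode (assumption)


def max_file_size(indirection_levels):
    """Largest file in bytes, given how many levels of indirect pointers exist.

    Level 1 is a single indirect pointer, level 2 adds a double indirect
    pointer, and level 3 adds a triple indirect pointer.
    """
    pointers_per_block = BLOCK_SIZE // POINTER_SIZE
    blocks = DIRECT_POINTERS
    for level in range(1, indirection_levels + 1):
        blocks += pointers_per_block ** level
    return blocks * BLOCK_SIZE


for levels in range(4):
    print(levels, max_file_size(levels))
```

Under these assumptions, direct pointers alone cover 48 KB, a single indirect block raises the limit to about 4 MB, a double indirect pointer to about 4 GB, and a triple indirect pointer to about 4 TB.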
While the inode data structure is very popular for its scalability, many studies have been performed to understand its efficacy and extent to which multi-level indices are needed. One study has shown some interesting measurements on file systems, including:
- Most files are actually very small (2KB is the most common size)
- The average file size is growing (almost 200 KB is the average)
- Most bytes are stored in large files (a few big files use most of the space)
- File systems contain lots of files (almost 100k on average)
- File systems are roughly half full (even as disks grow, file systems remain ~50% full)
- Directories are typically small (many have few entries, 20 or fewer)
This all points to the versatility and the scalability of the inode data structure, and how it supports most modern systems perfectly fine. Many optimizations have been implemented for speed and efficiency, but the core structure has changed little over recent times.
Directories
Under the hood, directories are simply a very specific type of file: they contain a list of entries using an (entry name, i-number) pairing system. The entry name is typically a human-readable name, and the corresponding i-number captures its underlying file-system «name.»
Each directory typically also contains 2 additional entries beyond the list of user-visible names: one entry («.») is the «current directory» pointer, and the other («..») is the parent directory pointer. When using a command line terminal, you can «change directory» by typing
- cd [directory name]
or move up a directory by using
- cd ..
where «..» is the abstract name of the parent directory pointer.
Since directories are typically just «special files,» managing the contents of a directory is usually as simple as adding and deleting pairings within the file. A directory typically has its own inode and stores its entries as a simple linear list (as described above), but other data structures such as B-trees have been used in some modern file systems such as XFS.
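The pairing of names and i-numbers can be illustrated with a toy, in-memory model. Everything in this Python sketch is hypothetical: the inodes table, the choice of 2 as the root i-number, and the lookup function exist only to show how a path is resolved one directory at a time.

```python
ROOT_INUMBER = 2  # hypothetical i-number for the root directory

# Toy "disk": i-number -> directory entries (dict) or file contents (bytes).
inodes = {
    2: {".": 2, "..": 2, "home": 3},
    3: {".": 3, "..": 2, "notes.txt": 4},
    4: b"hello",
}


def lookup(path):
    """Resolve an absolute path to an i-number by walking the directory tree."""
    inumber = ROOT_INUMBER
    for name in path.strip("/").split("/"):
        if name == "":
            continue  # the path was just "/"
        directory = inodes[inumber]
        if not isinstance(directory, dict):
            raise NotADirectoryError(name)
        if name not in directory:
            raise FileNotFoundError(path)
        inumber = directory[name]
    return inumber


print(lookup("/home/notes.txt"))  # 4
print(lookup("/home/.."))         # 2
```

Note that «.» and «..» need no special handling: they are ordinary entries that happen to map back to the directory itself and to its parent.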
Access Methods and Optimizations
A file system would be useless if you could not read and write data to it. For this step, you need a well defined methodology to enable the operating system to access and interpret the bytes in the data region.
The basic operations on a file include opening a file, reading a file, or writing to a file. These procedures require a huge number of input/output operations (I/O), and are typically scattered over the disk. For example, traversing a file system tree from the root node to the file of interest requires jumping from an inode to a directory file (potentially multi-indexed) to the file location. If the file does not exist, then certain additional operations such as creating an inode entry and assigning permissions are required.
Many technologies, both in hardware and software, have been developed to improve access times and interactions with storage. A very common hardware optimization is the use of SSDs, which have much improved access times because they have no moving parts. Hard drives, on the other hand, have mechanical parts (spinning platters and a moving read/write head), which means there are physical limitations on how fast you can «jump» from one part of the disk to another.
While SSDs provide fast disk accesses, that typically isn’t enough to accelerate reading and writing data. The operating system will commonly use faster, volatile memory structures such as RAM and caches to make the data «closer» to the processor, and accelerate operations. In fact, the operating system itself is typically stored on a file system, and one major optimization is to keep common read-only OS files perpetually in RAM in order to ensure the operating system runs quickly and efficiently.
Without going into the nitty-gritty of file operations, there are some interesting optimizations that are employed for data management. For example, when deleting a file, one common optimization is to simply delete the inode pointing to the data, and mark the corresponding disk regions as «free space.» The data on disk isn’t physically wiped out in this case, but access to it is removed. In order to fully «delete» a file, certain formatting operations can be done to write all zeroes (0) over the disk regions being deleted.
Another common optimization is moving data. As users, we might want to move a file from one directory to another based on our personal organization preferences. The file system, however, just needs to change minimal data in a few directory files, rather than actually shifting bits from one place to another. By using the concept of inodes and pointers, a file system can perform a «move» operation (within the same disk) very quickly.
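You can observe this from user space. The following Python sketch creates a file, moves it to another directory on the same file system, and compares the inode number before and after; the helper name move_keeps_inode and the file names are made up for the example. It assumes a POSIX-style file system where os.stat reports a stable inode number.

```python
import os
import tempfile


def move_keeps_inode():
    """Move a file between two directories and return its inode number
    before and after the move."""
    with tempfile.TemporaryDirectory() as root:
        src_dir = os.path.join(root, "a")
        dst_dir = os.path.join(root, "b")
        os.mkdir(src_dir)
        os.mkdir(dst_dir)

        src = os.path.join(src_dir, "report.txt")
        dst = os.path.join(dst_dir, "report.txt")
        with open(src, "wb") as handle:
            handle.write(b"x" * 1024)

        before = os.stat(src).st_ino
        os.rename(src, dst)  # only directory entries change
        after = os.stat(dst).st_ino
        return before, after


before, after = move_keeps_inode()
print(before == after)
```

Because both directories live on the same file system, the rename only rewrites directory entries; the inode and the data blocks it points to stay where they are. Moving a file to a different disk is a different story, since the data really does have to be copied.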
When it comes to «installing» applications or games, this simply means copying over files to a specific location and setting global variables and flags for making them executable. In Windows, an install typically asks for a directory, and then downloads the data for running the application and places it into that directory. There is nothing particularly special about an install, other than the automated mechanism for writing many files and directories from an external source (online or physical media) into the disk of choice.
Common File Systems
Modern file systems have many detailed optimizations that work hand-in-hand with the operating system to improve performance and provide various features (such as security or large file support). Some of the most popular file systems today include FAT32 (for flash drives and, previously, Windows), NTFS (for Windows), and ext4 (for Linux).
At a high level, all these file systems have similar on-disk structures, but differ in the details and the features that they support. For example, the FAT (File Allocation Table) format was initially designed in 1977, and was used in the early days of personal computing; FAT32 is a later 32-bit variant introduced in 1996. It uses a linked list for file and directory accesses, which, while simple, can be slow for larger disks. Today, it is a commonly used format for flash drives.
NTFS (New Technology File System), released by Microsoft in 1993, addressed many of the limitations of FAT. It improves performance by storing various additional metadata about files and supports structures for encryption, compression, sparse files, and journaling. NTFS is still used today in Windows 10 and 11. Similarly, macOS and iOS devices use a proprietary file system created by Apple. HFS+ (also known as Mac OS Extended) was the standard until Apple introduced the Apple File System (APFS) in 2017, which is better optimized for faster storage media and supports advanced capabilities like encryption and increased data integrity.
The fourth extended filesystem, or ext4, is the fourth iteration of the ext file system. It was declared stable in 2008 and is the default for many Linux distributions including Debian and Ubuntu. It can support large file sizes (up to 16 tebibytes), and uses the concept of extents to further enhance inodes and metadata for files. It uses a delayed allocation system to reduce writes to disk, and adds checksums for the journal and metadata to improve data integrity. It can also be accessed from Windows and macOS, though this generally requires additional tools or drivers.
Each file system provides its own set of features and optimizations, and may have many implementation differences. However, fundamentally, they all carry out the same functionality of supporting files and interacting with data on disk. Certain file systems are optimized to work better with different operating systems, which is why the file system and operating system are very closely intertwined.
Next-Gen File Systems
One of the most important features of a file system is its resilience to errors. Hardware errors can occur for a variety of reasons, including wear-out, random voltage spikes or droops (from processor overclocking or other optimizations), random alpha particle strikes (also called soft errors), and many other causes. In fact, hardware errors are such a costly problem to identify and debug, that both Google and Facebook have published papers about how important resilience is at scale, particularly in data centers.
To that end, most next-gen file systems are focusing on stronger resiliency and fast(er) security. These features come at a cost, typically incurring a performance penalty in order to incorporate more redundancy or security features into the file system.
Hardware vendors typically include various protection mechanisms for their products such as ECC protection for RAM, RAID options for disk redundancy, or full-blown processor redundancy such as Tesla’s Full Self-Driving (FSD) chip. However, that additional layer of protection in software via the file system is just as important.
Microsoft has been working on this problem for many years now in its Resilient File System (ReFS) implementation. ReFS was originally released for Windows Server 2012, and is meant to succeed NTFS. ReFS uses B+ trees for all its on-disk structures (including metadata and file data), and has a resiliency-first approach to implementation. This includes checksums for all metadata, stored independently, and an allocate-on-write policy. Effectively, this reduces the burden on administrators, who no longer need to run periodic error-checking tools such as CHKDSK when using ReFS.
In the open-source world, Btrfs (pronounced «better FS» or «Butter FS») is gaining traction with similar features to ReFS. Again, the primary focus is on fault-tolerance, self-healing properties, and easy administration. It also provides better scalability than ext4, with a maximum volume size roughly 16 times larger (16 EiB versus 1 EiB).
Summary
While there are many different file systems in use today, the main objective and high-level concepts have changed little over time. To build a file system, you need some basic information about each file (metadata) and a scalable storage structure to write and read from various files.
The underlying implementation of inodes and files together forms a very extensible system, which has been fine-tuned and tweaked to provide us with modern file systems. While we may not think about file systems and their features in our day-to-day lives, it is a true testament to their robust and scalable design that we can enjoy and access our digital data on computers, phones, consoles, and various other systems.
Masthead image: Jelle Dekkers
Computers use particular kinds of file systems to store and organize data on media, such as a hard drive or flash drive, or the CDs, DVDs, and BDs in an optical drive.
A file system can be thought of as an index or database containing the physical location of every piece of data on the device. The data is usually organized in folders called directories, which can contain other folders and files.
Any place that a computer or other electronic device stores data employs some type of file system. This includes your Windows computer, your Mac, your smartphone, your bank’s ATM—even the computer in your car!
Windows File Systems
The Microsoft Windows operating systems have always supported various versions of the FAT file system. FAT stands for File Allocation Table, a term that describes what it does: maintains a table of each file’s space allocation.
In addition to FAT, all Windows operating systems since Windows NT support a newer file system called NTFS—New Technology File System. For Windows NT, the NT stood for new technology.
All modern versions of Windows also support exFAT, which is designed for flash drives.
ReFS (Resilient File System) is a newer file system for Windows 11, 10, and 8 that includes features not available with NTFS, but it’s currently limited in several ways, and support for each ReFS version varies between Windows versions.
A file system is set up on a drive during a format.
More About File Systems
Files on a storage device are kept in sectors. Sectors marked as unused can store data, typically in groups of sectors called blocks. It’s the file system that identifies the size and position of the files, as well as which sectors are ready to be used.
Over time, writing to and deleting from a storage device causes fragmentation, because the way the file system stores data inevitably leaves gaps between different parts of a file. A free defrag utility can help fix that.
Without a structure for organizing files, it not only would be next to impossible to remove installed programs and retrieve specific files, but no two files could exist with the same name because everything might be in the same folder (which is one reason folders are so useful).
Take an image as an example of files with the same name. The file IMG123.jpg can exist in hundreds of folders because each folder keeps its copy separate, so there isn’t a conflict. However, files can’t bear the same name if they’re in the same directory.
A file system doesn’t just store the files but also information about them, like the sector block size, fragment information, file size, attributes, file name, file location, and directory hierarchy.
Some operating systems other than Windows also take advantage of FAT and NTFS, but many kinds of file systems dot the operating-system horizon, like HFS+ used in Apple products such as iOS and macOS. Wikipedia has a comprehensive list of file systems if you’re more interested in the topic.
Sometimes, the term «file system» is used in the context of partitions. For example, saying «there are two file systems on my hard drive» doesn’t mean that the drive is split between NTFS and FAT, but that there are two separate partitions that use the same physical disk.
Most applications you come into contact with require a file system in order to work, so every partition should have one. Also, programs are built for a particular operating system and the file systems it supports, meaning you can’t use a program on Windows if it was built for use in macOS.