Tabs versus Spaces in Source Code
Xah Lee, 2006-05-13
In writing a computer program, one often has to choose between tabs and
spaces for code indentation. There is a large amount of confusion about
which is better. It has become what's known as a “religious war”:
a heated fight over trivia. In this essay, i'd like to explain the
situation behind it, and which is proper.
Simply put, tabs are proper, and spaces are improper. Why? This may seem
ridiculously simple given the de facto ball of confusion: the semantics
of the tab is what indenting is about, while using spaces to align code
is a hack.
Now, tech geekers may object to this simple conclusion because they itch
to drivel about different editors and so on. The alleged problems
created by tabs, as seen by industry coders, are caused by two
things: (1) Tech geekers' sloppiness and lack of critical thinking,
which lead them to not understand the semantic purposes of the tab and
space characters. (2) Due to the first reason, they have created and
propagated a massive non-understanding and misuse, to the degree that
many tools (e.g. vi) do not deal with tabs well and using spaces to
align code has become widely practiced, so that in the end spaces seem
to be actually better by popularity and seeming simplicity.
In short, this is a phenomenon of misunderstanding begetting a snowball
of misunderstanding, such that it created a cultural milieu that embraces
this malpractice and kicks out what is true or proper. Situations like this
happen a lot in unix. One non-unix example is the file name
suffix known as the “extension”, where the code for the file's type became
part of the file name (e.g. “.txt”, “.html”, “.jpg”).
Another well-known example is HTML practice in the industry, where
badly designed tags born of corporations' competitive greed, and stupid
coding and misunderstanding by coders and their tools, are so
widespread that they push the correct way aside through the
eventual standardization caused by the sheer quantity of improper but
established practice.
Now, tech geekers may still object that using tabs requires
editors to set tab positions, and plain files don't carry that
information. This is a good point, and the solution is to advance
the sciences such that your source code in some way embeds such
information. This would be progress. However, this is never thought of
because the “unix philosophies” have already conditioned people to hack
and be shallow. In this case, many will simply use the character
intended to separate words for the purpose of indentation or alignment,
and spread the practice with militant drivel.
Now, given the already messed-up situation of tabs vs spaces created by the
unixers and the unix brainwashing of coders in the industry... which
should we use today? I do not have a good proposition, other than to
use whichever works for you, but put more critical thinking into
things to prevent mishaps like this.
Tabs vs spaces can be thought of as parameters vs hard-coded values, or
HTML vs ascii format, or XML/CSS vs HTML 4, or structural vs visual, or
semantic vs format. In these, it is always easy to convert from the
former to the latter, but near impossible from the latter to the
former. That is because the former encodes information that is
lost in the latter. If we look at the issue of tabs vs spaces, indeed,
it is easy to convert tabs to spaces in source code, but more
difficult to convert from spaces to tabs, because tabs as indentation
actually contain the semantic information about indentation. With
spaces, this critical information is lost in space.
This issue is intimately related to another issue in source code:
soft-wrapped lines versus physical, hard-wrapped lines by EOL (the
end-of-line character). This issue has far more consequences than tabs vs
spaces, and the unixers' unthinking has done far-reaching damage in
the computing industry. Unix's EOL way of thinking has
created languages based on EOL (just about ALL languages except the
Lisp family and Mathematica), tools based on EOL (cvs, diff, grep,
and basically every tool in unix), and thoughts based on EOL (software
value estimation by counting EOL, the hard-coded email quoting system by
“>” prefix, and silent line truncation in many unix tools), such
that any progress or development towards an “algorithmic code unit”
concept or language syntax is suppressed. I have not written a full
account of this issue, but i've touched on it in this essay: “The Harm
of hard-wrapping Lines”, at
http://xahlee.org/UnixResource_dir/writ/hard-wrap.html
----
This post is archived at:
http://xahlee.org/UnixResource_dir/writ/tabs_vs_spaces.html
Xah
···@xahlee.org
∑ http://xahlee.org/
Actually, spaces are better for indenting code. The exact amount of
space taken up by one space character will always be (or at least tend
to be) the same, while every combination of keyboard driver, operating
system, text editor, content/file format, and character encoding
changes precisely what the tab key does.
There's no use in typing "tab" for indentation when my text editor will
simply convert it to three spaces, or worse, autoindent and mix tabs
with spaces so that I have no idea how many actual whitespace characters
of what kinds are really taking up all that whitespace. I admit it
doesn't usually matter, but then you go back to try and make your code
prettier and find yourself asking "WTF?"
Undoubtedly adding the second spark to the holy war,
Eli
--
The science of economics is the cleverest proof of free will yet
constructed.
If I work on your project, I follow the coding and style standards you
specify.
Likewise if you work on my project you follow the established standards.
Fortunately for you, I am fairly liberal on such matters.
I like to see 4 spaces for indentation. If you use tabs, that's what I
will see, and you're very likely to have your code reformatted by the
automated build process, when the standard copyright header is pasted
and missing javadoc tags are generated as warnings.
I like the open brace to start on the line of the control keyword. I
can deal with the open brace being on the next line, at the same level
of indentation as the control keyword. I don't quite understand the
motivation behind the GNU style, where the brace itself is treated as a
half-indent, but I can live with it on *your* project.
Any whitespace or other style that isn't happy to be reformatted
automatically is an error anyway.
I'd be very laissez-faire about it except for the fact that code
repositories are much easier to manage if everything is formatted before
it goes in, or as a compromise, as a step at release tags.
Spaces work better. Hitting the TAB key in my Emacs will auto-indent
the current line. Only spaces will be used for fill. The worst thing
you can do is mix the two regardless of how you feel about tab vs
space.
The next step in evil is to give tab actual significance like in
make.
Xah Lee is getting better at trolling. He might fill up Google's
storage.
--
http://www.david-steuber.com/
1998 Subaru Impreza Outback Sport
2006 Honda 599 Hornet (CB600F) x 2 Crash & Slider
The lithobraker. Zero distance stops at any speed.
David Steuber wrote:
> The worst thing
> you can do is mix the two regardless of how you feel about tab vs
> space.
Not so bad; it all gets auto-formatted before it goes in cvs anyway.
If you don't like that policy you don't work on my project.
On 2006-05-15 00:55:35 -0400, jmcgill <·······@email.arizona.edu> said:
> David Steuber wrote:
>> The worst thing
>> you can do is mix the two regardless of how you feel about tab vs
>> space.
>
> Not so bad; it all gets auto-formatted before it goes in cvs anyway.
> If you don't like that policy you don't work on my project.
You do something incredibly cool like auto-formatting code before
checking it in and then you turn around and do something uncool like
still using cvs?
Odd.
Novus
Here are two clauses to put in your .emacs file (or _emacs on Windows)
to make Emacs work as Steuber describes.
(add-hook 'java-mode-hook
          (lambda () (setq indent-tabs-mode nil))) ; indentation can't insert a tab
(add-hook 'java-mode-hook
          (lambda () (setq tab-width 4)))          ; if a tab gets in, it looks this big
                                                   ; 2, 3, 4, 8: your choice
In article <··············@david-steuber.com>,
David Steuber <·····@david-steuber.com> wrote:
>Spaces work better. Hitting the TAB key in my Emacs will auto-indent
>the current line. Only spaces will be used for fill. The worst thing
>you can do is mix the two regardless of how you feel about tab vs
>space.
This really hits at the crux of the matter if, like me, you largely code
in Java. Sun uses a Java coding standard that includes tab-based full
indents and space-based half indents. And, of course, their full
indents are sometimes 8 spaces and sometimes 1 tab. This makes
Sun-formatted code unreadable unless you have the exact same Tab-size
as Sun does and, quite frankly, 8 spaces to the tab is just too much.
Since I look at the Java API sources every now and then while
programming, I therefore set my Tab to be 8 spaces to make it
readable, and run with 3-space indents in my own source code.
I wouldn't really mind much using Tab as indent instead, but that
cannot coexist very happily with a Sun-mandated 8-space Tab setting.
>The next step in evil is to give tab actual significance like in
>make.
Since I don't tend to use make a lot, I think of it as "quaint" rather
than "evil" :-)
Cheers
Bent D
--
Bent Dalager - ···@pvv.org - http://www.pvv.org/~bcd
powered by emacs
Bent C Dalager wrote:
> In article <··············@david-steuber.com>,
> David Steuber <·····@david-steuber.com> wrote:
> >Spaces work better. Hitting the TAB key in my Emacs will auto-indent
> >the current line. Only spaces will be used for fill. The worst thing
> >you can do is mix the two regardless of how you feel about tab vs
> >space.
>
> This really hits at the crux of the matter if, as me, you largely code
> in Java. Sun uses a Java coding standard that includes tab-based full
> indents and space-based half indents. And, of course, their full
> indents are sometimes 8 spaces and sometimes 1 tab. This makes
> Sun-formatted code unreadable unless you have the exact same Tab-size
> as Sun does and, quite frankly, 8 spaces to the tab is just too much.
The 8 space tab is industry standard. Anything else is an abomination.
What Sun's source code is doing is the only acceptable use of tabs in
source code whatsoever: groups of eight spaces are replaced by a tab in
a maximally greedy way to save space.
Many editors support this mode of indentation: use a greedy number of
tabs instead of spaces, and then add a few spaces of padding to achieve
the indentation.
Vim, Emacs, even the editor in Microsoft's Visual Studio.
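A minimal sketch of that greedy scheme, in Python. The function name
greedy_indent and the default width of 8 are assumptions for
illustration, not any editor's actual implementation:

```python
def greedy_indent(columns: int, tab_width: int = 8) -> str:
    """Return leading whitespace that reaches `columns`, using as many
    tabs as fit and padding the remainder with spaces."""
    if columns < 0 or tab_width < 1:
        raise ValueError("columns must be >= 0 and tab_width >= 1")
    tabs, spaces = divmod(columns, tab_width)
    return "\t" * tabs + " " * spaces
```

With 8-column tabs, an indentation of 12 columns becomes one tab plus
four spaces, and it only renders as 12 columns in a viewer that also
assumes 8-column tabs.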
> Since I look at the Java API sources every now and then while
> programming, I therefore set my Tab to be 8 spaces to make it
> readable, and run with 3-space indents in my own source code.
If you have to set your tab to be 8 spaces, your editor is a
braindamaged pile of crap.
It should be the default setting, and ideally not even overrideable.
And by golly, there is such a tool.
Microsoft's NOTEPAD.EXE has an 8 space tab, which cannot be changed.
So if your pointy-haired boss ever views a text file that you produced,
and he uses Notepad on his Windows laptop, it behooves you to have
assumed 8 space tabs.
> I wouldn't really mind much using Tab as indent instead, but that
> cannot coexist very happily with a Sun-mandated 8-space Tab setting.
Or would that be Notepad-mandated? Hahahaha.
>The 8 space tab is industry standard. Anything else is an abomination.
It may be a standard, but it's a standard from the days of the ASR-33
TTY, and I suspect possibly from typewriters of the early 1900s. But
even the vintage typewriters I've seen had adjustable tab stops.
>So if your pointy-haired boss ever views a text file that you produced,
>and he uses Notepad on his Windows laptop, it behooves you to have
>assumed 8 space tabs.
I would have the confidence to insist that he use an editor that is
approved by our departmental policies, and I would go further and
require him to observe our coding standards. I have the confidence in
my skills and in my value to the company, to look such a person square
in the eye and explain his problem to him. I can't imagine the
situation you describe ever being a problem.
"Kaz Kylheku" <········@gmail.com> writes:
> The 8 space tab is industry standard. Anything else is an abomination.
Yes, that's the reason why they shouldn't be used for indenting,
they're too wide. Space is good for indenting.
> What Sun's source code is doing is the only acceptable use of tabs in
> source code whatsoever: groups of eight spaces are replaced by a tab in
> a maximally greedy way to save space.
To save what?
http://www.shopping.com/xGS-LaCie-Bigger-Disk~NS-1~linkin_id-3068036
Let me see, the LOC/programmer/month is anything between 40 and
800. Let's take 400 lines of 40 characters, or 16000 c/p/m. We've
been programming for 50 years, there's about 4 million programmers
worldwide, that's 320 GB to store all the code produced ever, half
that if you compress it, and you still have a lot of space to store
movies and mp3.
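The estimate can be re-run as a quick sketch in Python. It assumes one
byte per character, twelve months a year, and the same head count for
all 50 years, none of which is stated above. Under those assumptions
the product comes out at tens of terabytes rather than hundreds of
gigabytes, which is still a storable amount:

```python
# Inputs taken from the figures above; the unit choices are assumptions.
lines_per_month = 400          # per programmer
chars_per_line = 40            # assumed one byte per character
years = 50
programmers = 4_000_000

chars_per_month = lines_per_month * chars_per_line   # 16,000 c/p/m
months = years * 12
total_bytes = chars_per_month * months * programmers
total_tb = total_bytes / 1e12                        # decimal terabytes

print(f"{total_bytes:,} bytes, about {total_tb:.1f} TB")
```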
--
__Pascal Bourguignon__ http://www.informatimago.com/
Un chat errant
se soulage
dans le jardin d'hiver
Shiki
Pascal Bourguignon <···@informatimago.com> wrote:
> Let me see, the LOC/programmer/month is anything between 40 and
> 800.
40 lines of code per programmer per month? Wow! That's less than two
lines per day ON AVERAGE! Suddenly I don't feel so bad about the
occasional unproductive day at work...
--
Chris Smith
Chris Smith <·······@twu.net> writes:
> Pascal Bourguignon <···@informatimago.com> wrote:
>> Let me see, the LOC/programmer/month is anything between 40 and
>> 800.
>
> 40 lines of code per programmer per month? Wow! That's less than two
> lines per day ON AVERAGE! Suddenly I don't feel so bad about the
> occasional unproductive day at work...
40 lines of DEBUGGED, TESTED, DOCUMENTED, and SHIPPED lines of code.
Yeah. Sounds about right. :-(
Alain Picard wrote:
> Chris Smith <·······@twu.net> writes:
>
>
>>Pascal Bourguignon <···@informatimago.com> wrote:
>>
>>>Let me see, the LOC/programmer/month is anything between 40 and
>>>800.
>>
>>40 lines of code per programmer per month? Wow! That's less than two
>>lines per day ON AVERAGE! Suddenly I don't feel so bad about the
>>occasional unproductive day at work...
>
>
> 40 lines of DEBUGGED, TESTED, DOCUMENTED, and SHIPPED lines of code.
>
> Yeah. Sounds about right. :-(
But the objective was to bound the total amount of code. You don't
assume all code is debugged, tested, documented, and shipped?
Patricia
Patricia Shanahan <····@acm.org> writes:
> Alain Picard wrote:
>> Chris Smith <·······@twu.net> writes:
>>
>>>Pascal Bourguignon <···@informatimago.com> wrote:
>>>
>>>>Let me see, the LOC/programmer/month is anything between 40 and
>>>>800.
>>>
>>> 40 lines of code per programmer per month? Wow! That's less than
>>> two lines per day ON AVERAGE! Suddenly I don't feel so bad about
>>> the occasional unproductive day at work...
>> 40 lines of DEBUGGED, TESTED, DOCUMENTED, and SHIPPED lines of code.
>> Yeah. Sounds about right. :-(
>
> But the objective was to bound the total amount of code. You don't
> assume all code is debugged, tested, documented, and shipped?
Well not really, you don't need to keep scratch and prototype code
that's 30 years old. Just keep a few source snapshots. But even if
you wanted to keep all the keypresses of all the programmers ever,
that'd be less than 200 TB, quite a realizable capacity today, and even
more so in a few years.
--
__Pascal Bourguignon__ http://www.informatimago.com/
PLEASE NOTE: Some quantum physics theories suggest that when the
consumer is not directly observing this product, it may cease to
exist or will exist only in a vague and undetermined state.
Pascal Bourguignon <···@informatimago.com> wrote:
+---------------
| Patricia Shanahan <····@acm.org> writes:
| > But the objective was to bound the total amount of code. You don't
| > assume all code is debugged, tested, documented, and shipped?
|
| Well not really, you don't need to keep scratch and prototype code
| that's 30 years old. Just keep a few sources snapshoots. But even if
| you wanted to keep all the keypresses of all the programmers ever,
| that'd be less than 200 TB, quite a realizable capacity today, even so
| in a few years.
+---------------
Indeed, see <http://www.agami.com/products/>. With the AIS-6119
(on the far right), you can get just under 154 TB in a single rack,
today. [That's with 400 GB drives. 192 TB in a single rack, when
the 500 GB drives become available.]
-Rob
p.s. Obligatory disclosure: I'm currently employed by Agami...
-----
Rob Warnock <····@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607
Pascal Bourguignon wrote:
> "Kaz Kylheku" <········@gmail.com> writes:
> > The 8 space tab is industry standard. Anything else is an abomination.
>
> Yes, that's the reason why they shouldn't be used for indenting,
> they're too wide. Space is good for indenting.
>
>
> > What Sun's source code is doing is the only acceptable use of tabs in
> > source code whatsoever: groups of eight spaces are replaced by a tab in
> > a maximally greedy way to save space.
>
> To save what?
> http://www.shopping.com/xGS-LaCie-Bigger-Disk~NS-1~linkin_id-3068036
Exactly. So the only justification for these tabs is very flimsy, isn't
it?
Maybe it's the cycles. Think of all the cycles that are wasted
lexically analyzing eight spaces compared to one tab.
:)
On 16 May 2006 11:04:38 -0700, "Kaz Kylheku" <········@gmail.com>
wrote, quoted or indirectly quoted someone who said :
>
>The 8 space tab is industry standard.
I think you overstate it. It is pretty common on Windows though.
--
Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.
Kaz Kylheku wrote:
> Bent C Dalager wrote:
> > In article <··············@david-steuber.com>,
> > David Steuber <·····@david-steuber.com> wrote:
> The 8 space tab is industry standard. Anything else is an abomination.
Perhaps it is some industry's standard but it is a huge waste of useful
space.
John Gagon
On Mon, 15 May 2006 02:44:54 GMT, Eli Gottlieb <···········@gmail.com>
wrote, quoted or indirectly quoted someone who said :
>Actually, spaces are better for indenting code.
Agreed. All it takes is one programmer to use a different tab
expansion convention to screw up a project. Spaces are unambiguous.
Ideally though you should run code through a beautifier before checkin
to avoid false deltas with people manually formatting code slightly
differently.
--
Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.
Xah Lee wrote:
> Tabs versus Spaces in Source Code
>
> Xah Lee, 2006-05-13
>
> In coding a computer program, there's often the choices of tabs or
> spaces for code indentation. There is a large amount of confusion about
> which is better. It has become what's known as “religious war” —
> a heated fight over trivia. In this essay, i like to explain what is
> the situation behind it, and which is proper.
>
Thanks Xah. I value your posts. Keep posting. And since your posts
usually cover broad areas of CS, keep crossposting. Don't go anywhere
Xah :-)
> Simply put, tabs is proper, and spaces are improper. Why? This may seem
> ridiculously simple given the de facto ball of confusion: the semantics
> of tabs is what indenting is about, while, using spaces to align code
> is a hack.
>
I wouldn't say that spaces are a hack, but tabs are superior.
> Now, tech geekers may object this simple conclusion because they itch
> to drivel about different editors and so on. The alleged problem
> created by tabs as seen by the industry coders are caused by two
> things: (1) tech geeker's sloppiness and lack of critical thinking
> which lead them to not understanding the semantic purposes of tab and
> space characters. (2) Due to the first reason, they have created and
> propagated a massive none-understanding and mis-use, to the degree that
> many tools (e.g. vi) does not deal with tabs well and using spaces to
> align code has become widely practiced, so that in the end spaces seem
> to be actually better by popularity and seeming simplicity.
>
Don't forget the laziness of programmers like me who don't put the
tabbing information in the source file. Vim deals with tabs well IMO,
but I almost never used to put the right auto-commands in the file to
get it set up right for other users.
> In short, this is a phenomenon of misunderstanding begetting a snowball
> of misunderstanding, such that it created a cultural milieu to embrace
> this malpractice and kick what is true or proper. Situations like this
> happens a lot in unix. For one non-unix example, is the file name's
> suffix known as “extension”, where the code of file's type became
> part of the file name. (e.g. “.txt”, “.html”, “.jpg”).
> Another well-known example is HTML practices in the industry, where
> badly designed tags from corporation's competitive greed, and stupid
> coding and misunderstanding by coders and their tools are so
> wide-spread such that they force the correct way to the side by the
> eventual standardization caused by sheer quantity of inproper but set
> practice.
>
> Now, tech geekers may still object, that using tabs requires the
> editors to set their positions, and plain files don't carry that
> information. This is a good question, and the solution is to advance
> the sciences such that your source code in some way embed such
> information.
Vim does this. We just have to use it.
> This would be progress. However, this is never thought of
> because the “unix philosophies” already conditioned people to hack
> and be shallow. In this case, many will simply use the character
> intended to separate words for the purpose of indentation or alignment,
> and spread the practice with militant drivels.
>
> Now, given the already messed up situation of the tabs vs spaces by the
> unixers and unix brain-washing of the coders in the industry... Which
> should we use today? I do not have a good proposition, other than just
> use whichever that works for you but put more critical thinking into
> things to prevent mishaps like this.
>
> Tabs vs Spaces can be thought of as parameters vs hard-coded values, or
> HTML vs ascii format, or XML/CSS vs HTML 4, or structural vs visual, or
> semantic vs format. In these, it is always easy to convert from the
> former to the latter, but near impossible from the latter to the
> former. And, that is because the former encodes information that is
> lost in the latter.
Nope. Conversion is relatively easy. I've written programs to do this
myself, and everyone and his brother has also done this. Virtually every
programmer's editor that I've ever used can do this, and a great, great
many independent programs convert tabs to spaces. It's like saying,
"it's near impossible to write a calculator program." :-)
I bet that someone has a Perl one-liner to do it.
On any Debian system, try a "man expand" and see what you find. Also,
emacs and vim do it. Perl has a Text::Tabs module. TCL's
::textutil::(un)?tabify routines do it. The birds do it, and the bees do
it. Oh wait, that's something else :-)
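For what it's worth, Python can be added to that list: the built-in
str.expandtabs method does the same job. A small sketch, where the
sample text and the width of 4 are just illustrative choices:

```python
# str.expandtabs replaces each tab with enough spaces to reach the
# next tab stop; the column count restarts after every newline.
source = "if x:\n\treturn 1\n"
expanded = source.expandtabs(4)

# A tab in the middle of a line pads only up to the next stop.
aligned = "a\tb".expandtabs(8)
```

Here expanded holds the same two lines with the tab turned into four
spaces, and aligned has seven spaces between "a" and "b".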
> If we look at the issue of tabs vs spaces, indeed,
> it is easy to convert tabs to spaces in a source code, but more
> difficult to convert from spaces to tabs.
Nope again. It's easy, you just keep track of the virtual character
position as you decide whether to write a space or a tab. Computers do
the "counting" thing fairly well.
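A rough sketch of that column-tracking idea in Python. The name
spaces_to_tabs is made up for illustration, it handles one line at a
time, and it assumes every character is one column wide. It also knows
nothing about string literals, so it would happily put tabs inside
them:

```python
def spaces_to_tabs(line: str, tab_width: int = 8) -> str:
    """Replace runs of spaces that end on a tab stop with a single tab."""
    out = []
    col = 0        # virtual column of the next character
    pending = 0    # spaces seen since the last tab stop, not yet written
    for ch in line:
        if ch == " ":
            pending += 1
            col += 1
            if col % tab_width == 0:
                # The run reaches a tab stop: one tab covers it.
                # A lone space is kept as a space (same width anyway).
                out.append("\t" if pending > 1 else " ")
                pending = 0
        elif ch == "\t":
            # An existing tab jumps to the next stop and absorbs
            # any spaces pending inside the current cell.
            out.append("\t")
            pending = 0
            col += tab_width - col % tab_width
        else:
            out.append(" " * pending)
            pending = 0
            out.append(ch)
            col += 1
    out.append(" " * pending)   # trailing spaces short of a tab stop
    return "".join(out)
```

Expanding the result again with the same tab width gives back the
original line, which is a handy sanity check for any such converter.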
> Because, tabs as indentation
> actually contains the semantic information about indentation. With
> spaces, this critical information is lost in space.
>
> This issue is intimately related to another issue in source code:
> soft-wrapped lines versus physical, hard-wrapped lines by EOL (end of
> line character). This issue has far more consequences than tabs vs
> spaces, and the unixer's unthinking has made far-reaching damages in
> the computing industry. Due to unix's EOL ways of thinking, it has
> created languages based on EOL (just about ALL languages except the
> Lisp family and Mathematica) and tools based on EOL (cvs, diff, grep,
> and basically every tool in unix), thoughts based on EOL (software
> value estimation by counting EOL, hard-coded email quoting system by
> “>” prefix, and silent line-truncations in many unix tools), such
> that any progress or development towards a “algorithmic code unit”
> concept or language syntaxes are suppressed. I have not written a full
> account on this issue, but i've touched it in this essay: “The Harm
> of hard-wrapping Lines”, at
> http://xahlee.org/UnixResource_dir/writ/hard-wrap.html
> ----
> This post is archived at:
> http://xahlee.org/UnixResource_dir/writ/tabs_vs_spaces.html
>
> Xah
> ···@xahlee.org
> ∑ http://xahlee.org/
>
I've never thought of tabs-vs-spaces as a religious war. Anyway, the
authority of the programming environment will determine which one is
used. Have a good week Xah.
Oh God, I agree with Xah Lee. Someone take me out behind the chemical
sheds...
Iain
Iain King wrote:
> Oh God, I agree with Xah Lee. Someone take me out behind the chemical
> sheds...
>
> Xah Lee wrote:
<more worthless nonsense>
Please don't feed the troll!
And for the record, spaces are 100% portable, tabs are not. That ends
the argument for me.
Worse than either tabs or spaces however is Sun's mixture of the two.
--
Dale King
An old debate. My $0.02 :
http://numeromancer.dyndns.org/~timothy/tab-width-independence/description.html
The idea can be extended to other programming languages.
TS
> Simply put, tabs is proper, and spaces are improper.
> Why? This may seem
> ridiculously simple given the de facto ball of confusion: the semantics
> of tabs is what indenting is about, while, using spaces to align code
> is a hack.
The reality of programming practice trumps the original intent of tab
characters. The tab character and space character are pliable in that
if their use changes, their semantics change.
> ... and the solution is to advance
> the sciences such that your source code in some way
> embed such information.
If and when the time comes that such info is embedded, perhaps then tabs
will be OK.
---------------------------------------------------------------
I use spaces because, of the many sources I've opened, I have many times
sighed on opening tabbed ones and never done so on opening spaced ones.
I don't get mad, but sighing is a clear indicator of negativity.
Anyway, the more code I write and read the less indentation matters to
me. My brain can now parse awkward source correctly far better than it
did a few years ago.
All the best,
Opalinski
······@gmail.com
http://www.geocities.com/opalpaweb/
"·······@gmail.com opalinski from opalpaweb" <······@gmail.com> writes:
>> Simply put, tabs is proper, and spaces are improper.
>> Why? This may seem
>> ridiculously simple given the de facto ball of confusion: the semantics
>> of tabs is what indenting is about, while, using spaces to align code
>> is a hack.
>
> The reality of programming practice trumps original intent of tab
> characters. The tab character and space character are pliable in that
> if their use changes their semantics change.
>
>> ... and the solution is to advance
>> the sciences such that your source code in some way
>> embed such information.
>
> If and when the time comes that such info is embedded, perhaps then tabs
> will be OK.
>
> ---------------------------------------------------------------
>
> I use spaces because, of the many sources I've opened, I have many times
> sighed on opening tabbed ones and never done so on opening spaced ones.
>
> I don't get mad, but sighing is a clear indicator of negativity.
> Anyway, the more code I write and read, the less indentation matters to
> me. My brain can now parse awkward source correctly far better than it
> did a few years ago.
And anyway, C-x h C-M-\ comes automatically after C-x C-f source RET.
Just add this to your ~/.emacs:
(add-hook 'find-file-hook
          (lambda ()
            ;; Re-indent the whole buffer right after a file is visited.
            (indent-region (point-min) (point-max))
            (pop-mark)))
--
__Pascal Bourguignon__ http://www.informatimago.com/
IMPORTANT NOTICE TO PURCHASERS: The entire physical universe,
including this product, may one day collapse back into an
infinitesimally small space. Should another universe subsequently
re-emerge, the existence of this product in that universe cannot be
guaranteed.
······@gmail.com opalinski from opalpaweb wrote:
>>Simply put, tabs is proper, and spaces are improper.
>>Why? This may seem
>>ridiculously simple given the de facto ball of confusion: the semantics
>>of tabs is what indenting is about, while, using spaces to align code
>>is a hack.
>
>
> The reality of programming practice trumps original intent of tab
> characters. The tab character and space character are pliable in that
> if their use changes their semantics change.
[...]
Yes, when I started programming I also preferred tabs.
But with growing experience of how to handle this in real life
(different editors/systems/languages...) I saw that
converting the "so fine tabs" was annoying.
The only thing that always worked was spaces.
Tab: a nice idea, but it makes programming an annoyance.
Ciao,
Oliver
From: Edmond Dantes
Subject: Re: Tabs versus Spaces in Source Code
Date:
Message-ID: <1147905290_7@news-east.n>
Oliver Bandel wrote:
> ······@gmail.com opalinski from opalpaweb wrote:
...
> Yes, when I started programming I also preferred tabs.
> But with growing experience of how to handle this in real life
> (different editors/systems/languages...) I saw that
> converting the "so fine tabs" was annoying.
>
> The only thing that always worked was spaces.
> Tab: a nice idea, but it makes programming an annoyance.
>
> Ciao,
> Oliver
It all depends on your editor of choice. Emacs editing of Lisp (and a few
other languages, such as Python) makes the issue more or less moot. I
personally would recommend choosing one editor to use with all your
projects, and Emacs is wonderful in that it has been ported to just about
every platform imaginable.
The real issue is, of course, that ASCII is showing its age and we should
probably supplant it with something better. But I know that will never fly,
given the torrents of code, configuration files, and everything else in
ASCII. Even Unicode couldn't put a dent in it, despite the obvious growing
global development efforts. Not sure how many compilers would be able to
handle Unicode source anyway. I suspect the large majority of them would
choke big time.
Oh well...
--
-- Edmond Dantes, CMC
Edmond Dantes <······@le-comte-de-monte-cristo.biz> writes:
> It all depends on your editor of choice. Emacs editing of Lisp (and a few
> other languages, such as Python) makes the issue more or less moot. I
> personally would recommend choosing one editor to use with all your
> projects, and Emacs is wonderful in that it has been ported to just about
> every platform imaginable.
>
> The real issue is, of course, that ASCII is showing its age and we should
> probably supplant it with something better. But I know that will never fly,
> given the torrents of code, configuration files, and everything else in
> ASCII. Even Unicode couldn't put a dent in it, despite the obvious growing
> global development efforts. Not sure how many compilers would be able to
> handle Unicode source anyway. I suspect the large majority of them would
> choke big time.
All right, Unicode support is not 100% perfect yet, but most of my main
compilers support it perfectly well: only 1 in 5 doesn't support it, and
1 in 5 supports it only partially:
------(unicode-script.lisp)---------------------------------------------
(defun clisp (file)
  (ext:run-program "/usr/local/bin/clisp"
                   :arguments (list "-ansi" "-norc" "-on-error" "exit"
                                    "-E" "utf-8"
                                    "-i" file "-x" "(ext:quit)")
                   :input nil :output :terminal :wait t))
(defun gcl (file)
  (ext:run-program "/usr/local/bin/gcl"
                   :arguments (list "-batch"
                                    "-load" file "-eval" "(lisp:quit)")
                   :input nil :output :terminal :wait t))
(defun ecl (file)
  (ext:run-program "/usr/local/bin/ecl"
                   :arguments (list "-norc"
                                    "-load" file "-eval" "(si:quit)")
                   :input nil :output :terminal :wait t))
(defun sbcl (file)
  (ext:run-program "/usr/local/bin/sbcl"
                   :arguments (list "--userinit" "/dev/null"
                                    "--load" file "--eval" "(sb-ext:quit)")
                   :input nil :output :terminal :wait t))
(defun cmucl (file)
  (ext:run-program "/usr/local/bin/cmucl"
                   :arguments (list "-noinit"
                                    "-load" file "-eval" "(extensions:quit)")
                   :input nil :output :terminal :wait t))
(dolist (implementation '(clisp gcl ecl sbcl cmucl))
  (sleep 3)
  (terpri) (print implementation) (terpri)
  (funcall implementation "unicode-source.lisp"))
------(unicode-source.lisp)---------------------------------------------
;; -*- coding: utf-8 -*-
(eval-when (:compile-toplevel :load-toplevel :execute)
  (format t "~2%~A ~A~2%"
          (lisp-implementation-type)
          (lisp-implementation-version))
  (finish-output))
(defun ιοτα (&key (номер 10) (단계 1) (בכוכ 0))
  (loop :for i :from בכוכ :to номер :by 단계 :collect i))
(defun test ()
  (format t "~%Calling ~S --> ~A~%"
          '(ιοτα :номер 10 :단계 2 :בכוכ 2)
          (ιοτα :номер 10 :단계 2 :בכוכ 2)))
(test)
------------------------------------------------------------------------
(load"unicode-script.lisp")
;; Loading file unicode-script.lisp ...
CLISP
  i i i i i i i       ooooo    o        ooooooo   ooooo   ooooo
  I I I I I I I      8     8   8           8     8     o  8    8
  I  \ `+' /  I      8         8           8     8        8    8
   \  `-+-'  /       8         8           8      ooooo   8oooo
    `-__|__-'        8         8           8           8  8
        |            8     o   8           8     o     8  8
  ------+------       ooooo    8oooooo  ooo8ooo   ooooo   8
Copyright (c) Bruno Haible, Michael Stoll 1992, 1993
Copyright (c) Bruno Haible, Marcus Daniels 1994-1997
Copyright (c) Bruno Haible, Pierpaolo Bernardi, Sam Steingold 1998
Copyright (c) Bruno Haible, Sam Steingold 1999-2000
Copyright (c) Sam Steingold, Bruno Haible 2001-2006
;; Loading file unicode-source.lisp ...
CLISP 2.38 (2006-01-24) (built 3347193361) (memory 3347193794)
Calling (ΙΟΤΑ :НОМЕР 10 :단계 2 :בכוכ 2) --> (2 4 6 8 10)
;; Loaded file unicode-source.lisp
Bye.
GCL
GNU Common Lisp (GCL) GCL 2.6.7
Calling (ιοτα :номер 10 :단계 2 :בכוכ 2) --> (2 4 6 8
10)
ECL
;;; Loading "unicode-source.lisp"
ECL 0.9g
Calling (ιοτα :номер 10 :단계 2 :בכוכ 2) --> (2 4 6 8 10)
SBCL
This is SBCL 0.9.12, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.
SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.
SBCL 0.9.12
Calling (|ιοτα| :|номер| 10 :|ˋ¨ʳÂ| 2 :|בכוכ| 2) --> (2 4 6 8 10)
CMUCL
; Loading #P"/local/users/pjb/src/lisp/encours/unicode-source.lisp".
CMU Common Lisp 19c (19C)
Reader error at 214 on #<Stream for file "/local/users/pjb/src/lisp/encours/unicode-source.lisp">:
Undefined read-macro character #\Ã
[Condition of type READER-ERROR]
Restarts:
0: [CONTINUE] Return NIL from load of "unicode-source.lisp".
1: [ABORT ] Skip remaining initializations.
Debug (type H for help)
(LISP::%READER-ERROR
#<Stream for file "/local/users/pjb/src/lisp/encours/unicode-source.lisp">
"Undefined read-macro character ~S"
#\Ã)
Source: Error finding source:
Error in function DEBUG::GET-FILE-TOP-LEVEL-FORM: Source file no longer exists:
target:code/reader.lisp.
0] abort
*
Received EOF on *standard-input*, switching to *terminal-io*.
* (extensions:quit)
;; Loaded file unicode-script.lisp
T
[4]>
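For comparison, here is a minimal Python 3 sketch of the same test. It is an illustration added alongside the thread, not part of the run above. Python 3 accepts non-ASCII identifiers in UTF-8 source. The Hebrew keyword of the Lisp version is replaced here by a Greek one, αρχή ("start"), which is an assumption made only to keep the example free of right-to-left text.

```python
# -*- coding: utf-8 -*-
# Sketch only: a Python 3 analogue of the Lisp function above.
# The parameter name "αρχή" is a stand-in chosen for this example.


def ιοτα(номер=10, 단계=1, αρχή=0):
    """Return the integers from αρχή up to and including номер, stepping by 단계."""
    return list(range(αρχή, номер + 1, 단계))


print(ιοτα(номер=10, 단계=2, αρχή=2))
```

Whether a given editor, terminal or build tool displays such a file correctly is a separate question from whether the interpreter accepts it.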
--
__Pascal Bourguignon__ http://www.informatimago.com/
Grace personified,
I leap into the window.
I meant to do that.
"Jonathon McKitrick" <···········@bigfoot.com> writes:
> Pascal Bourguignon wrote:
>> (defun ιοτα (&key (номер 10) (단계 1) (בכוכ 0))
>> (loop :for i :from בכוכ :to номер :by 단계 :collect i))
>
> How do you even *enter* these characters? My browser seems to trap all
> the special character combinations, and I *know* you don't mean
> selecting from a character palette.
Why? Of course!
Aren't you either an emacs or a Mac user?
On a Mac, you just select the input keyboard from the Input menu (the
little flag on the right of the menubar; you may activate it from the
International System Preference panel).
On emacs, it's as simple: M-x set-input-method RET
I've bound C-F9, C-F10, C-F11, and C-F12 to various input methods:
(global-set-key [C-f9] (lambda()(interactive)(set-input-method 'chinese-py-b5)))
(global-set-key [C-f10] (lambda()(interactive)(set-input-method 'cyrillic-yawerty)))
(global-set-key [C-f11] (lambda()(interactive)(set-input-method 'greek)))
(global-set-key [C-f12] (lambda()(interactive)(set-input-method 'hebrew)))
C-\ is bound to toggle-input-method, which lets you switch back to the
usual input method.
For the alphabetic scripts, there's no difficulty, it's like with
roman scripts: each key is a character. For ideographic scripts, the
input methods are more sophisticated.
Then, you have to learn some of these strange languages. I learned
several (but I forgot everything but: לודג גד דג ינד, здраствуйте, я
люблю тибе, 我 聽龍, 我 不 中国人). For the Korean, I copy-and-pasted
it from some web translation service. But keying them in is the
easiest part.
--
__Pascal Bourguignon__ http://www.informatimago.com/
Cats meow out of angst
"Thumbs! If only we had thumbs!
We could break so much!"
Jonathon McKitrick wrote:
> Pascal Bourguignon wrote:
>
>>(defun ιοτα (&key (номер 10) (단계 1) (בכוכ 0))
>> (loop :for i :from בכוכ :to номер :by 단계 :collect i))
>
>
> How do you even *enter* these characters? My browser seems to trap all
> the special character combinations, and I *know* you don't mean
> selecting from a character palette.
Haven't you heard of those big keyboards?
12 meters x 2 meters, I think... you need a long
stick (maybe if you play golf, that can help).
Then you have all the UTF-8 characters there, that's fine,
but typing takes some time.
But it's good, because when you're done typing your email,
there's no need to go and do sports after work. So your boss
can insist that you stay at work longer.
Ciao,
Oliver
;-)
"Jonathon McKitrick" <···········@bigfoot.com> wrote in message
···························@j33g2000cwa.googlegroups.com...
> Pascal Bourguignon wrote:
>> (defun ιοτα (&key (номер 10) (단계 1) (בכוכ 0))
>> (loop :for i :from בכוכ :to номер :by 단계 :collect i))
>
> How do you even *enter* these characters? My browser seems to trap all
> the special character combinations, and I *know* you don't mean
> selecting from a character palette.
>
> hey, this is weird...
>
> î
>
> I've got something happening, but I can't tell what.
>
> Yes, I'm an ignorant Western world ASCII user. :-)
What OS are you using? In Windows XP, you'd have to let XP know that
you're interested in input in languages other than English via "Control
Panel -> Regional Settings -> Languages -> Text Services and Input
Languages". There, you'd add input methods other than English. Each "input
method" works in its own way, so you'll just have to learn them.
For example, under English, you can use the "keyboard" input method, which
is probably what you're using now, or the "handwriting recognition" input
method, or the "speech recognition" input method to insert English text.
There are other input methods for the Asian languages (e.g. Chinese,
Japanese, etc.)
- Oliver
Xah Lee wrote:
> Tabs vs Spaces can be thought of as parameters vs hard-coded values, or
> HTML vs ascii format, or XML/CSS vs HTML 4, or structural vs visual, or
> semantic vs format. In these, it is always easy to convert from the
> former to the latter, but near impossible from the latter to the
> former.
Bahaha, looks like someone hasn't thought things through very well.
Spaces, under a mono font, offer greater precision and expressivity in
achieving specific alignment. That expressivity cannot be captured by
tabs.
The difficulty in converting spaces to tabs rests not in any bridgeable
semantic gap, but in the lack of having any way whatsoever to express
using tabs what the spaces are expressing.
It's not /near/ impossible, it's /precisely/ impossible.
For instance, tabs cannot express these alignments:
/*
 * C block
 * comment
 * in a common style.
 */
(lisp
  (nested list
          with symbols
          and things))
(call to a function
      with many parameters)
;; how do you align "to" and "with" using tabs?
;; only if "to" lands on a tab stop; but dependence on specific tab stops
;; destroys the whole idea of tabs being parameters.
To do these alignments structurally, you need something more expressive
than spaces or tabs. But spaces do the job under a mono font, /and/
they do it in a completely unobtrusive way.
If you want to do nice typesetting of code, you have to add markup
which has to be stripped away if you actually want to run the code.
Spaces give you decent formatting without markup. Tabs do not. Tabs are
only suitable for aligning the first non-whitespace character of a line
to a stop. Only if that is the full extent of the formatting that you
need to express in your code can you achieve the ideal of being able to
change your tab parameter to change the indentation amount. If you need
to align characters which aren't the first non-whitespace in a line,
tabs are of no use whatsoever, and proportional fonts must be banished.
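The dependence on tab stops can be sketched in a few lines of Python. This is an illustration added to the thread, not code from the post; `column_of` is a made-up helper, and the continuation lines are assumed spellings of the "(call to a function" example above, one indented with a tab and one with six spaces.

```python
def column_of(line, token, tabsize):
    """Column at which `token` starts once tabs are expanded to `tabsize`."""
    return line.expandtabs(tabsize).index(token)


first = "(call to a function"
with_tab = "\twith many parameters)"               # continuation indented by one tab
with_spaces = " " * 6 + "with many parameters)"    # continuation indented by spaces

# "to" always starts in column 6. The tab version lines up with it only when
# the viewer's tab size happens to be 6; the space version lines up always.
for tabsize in (2, 4, 6, 8):
    print(tabsize,
          column_of(first, "to", tabsize),
          column_of(with_tab, "with", tabsize),
          column_of(with_spaces, "with", tabsize))
```

The sketch only measures columns under a fixed-width font; it says nothing about proportional fonts, where neither scheme aligns.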
Kaz Kylheku wrote:
> If you want to do nice typesetting of code, you have to add markup
> which has to be stripped away if you actually want to run the code.
Typesetting code is not a helpful activity outside of the publishing
industry. You might like the results of your typesetting; I happen not
to. You probably wouldn't like mine. Does that mean we shouldn't work
together? Only if you insist on forcing me to conform to your way of
displaying code.
You are correct in pointing out that tabs don't allow for 'alignment'
of the sort you mention:
(lisp
  (nested list
          with symbols
          and things))
But then neither does Python. I happen to think that's a feature.
(And of course you can do what you like inside a comment. That's
because tabs are for indentation, and indentation is meaningless in that
context. Spaces are exactly what you should use then. I may or may not
like your layout, but it won't break anything when we merge our code.)
achates wrote:
> Kaz Kylheku wrote:
>
> > If you want to do nice typesetting of code, you have to add markup
> > which has to be stripped away if you actually want to run the code.
>
> Typesetting code is not a helpful activity outside of the publishing
> industry.
Be that as it may, code writing involves an element of typesetting. If
you are aligning characters, you are typesetting, however crudely.
> You might like the results of your typesetting; I happen not
> to. You probably wouldn't like mine. Does that mean we shouldn't work
> together? Only if you insist on forcing me to conform to your way of
> displaying code.
Someone who insists that everyone should separate line indentation into
tabs which achieve the block level, and spaces that achieve additional
alignment, so that code could be displayed in more than one way based
on the tab size without loss of alignment, is probably a "space cadet",
who has a bizarre agenda unrelated to developing the product.
There is close to zero value in maintaining such a scheme, and
consequently, it's hard to justify with a business case.
Yes, in the real world, you have to conform to someone's way of
formatting and displaying code. That's how it is.
You have to learn to read, write and even like more than one style.
> You are correct in pointing out that tabs don't allow for 'alignment'
> of the sort you mention:
That alignment has a name: hanging indentation.
All forms of aligning the first character of a line to some requirement
inherited from the previous line are called indentation.
Granted, a portion of that indentation is derived from the nesting
level of some logically enclosing programming language construct, and
part of it may be derived from the position of a character of some
parallel constituent within the construct.
> (lisp
>   (nested list
>           with symbols
>           and things))
> But then neither does Python. I happen to think that's a feature.
Python has logical line continuation which gives rise to the need for
hanging indents to line up with parallel constituents in a folded
expression.
Python also allows for the possibility of statements separated by
semicolons on one line, which may need to be lined up in columns.
var = 42; foo = 53
x   =  2; y   = 10
> (And of course you can do what you like inside a comment. That's
> because tabs are for indentation, and indentation is meaningless in that
> context.
A comment can contain example code, which contains indentation.
What, I can't change the tab size to display that how I want? Waaah!!!
(;_;)
Xah Lee wrote:
> Tabs versus Spaces in Source Code
>
> Xah Lee, 2006-05-13
>
> In coding a computer program, there's often the choices of tabs or
> spaces for code indentation. There is a large amount of confusion about
> which is better. It has become what's known as “religious war” —
> a heated fight over trivia. In this essay, i like to explain what is
> the situation behind it, and which is proper.
>
> Simply put, tabs is proper, and spaces are improper.
[...]
I wholeheartedly disagree :)
So, no "essay" on this is necessary to read :->
Ciao,
Oliver
Xah Lee wrote:
> Tabs versus Spaces in Source Code
>
> Xah Lee, 2006-05-13
>
> In coding a computer program, there's often the choices of tabs or
> spaces for code indentation.
<snip>
> (2) Due to the first reason, they have created and
> propagated a massive none-understanding and mis-use, to the degree that
> many tools (e.g. vi) does not deal with tabs well
:set ts=<n>
Yeah, that's really tough. vi does just fine handling tabs. vim does
an even better job, with mode-lines, = and :retab.
In my experience, the people who complain about the use
of tabs for indentation are the people who don't know
how to use their editor, and those people tend to use
emacs.
"Bill Pursell" <············@gmail.com> writes:
> In my experience, the people who complain about the use
> of tabs for indentation are the people who don't know
> how to use their editor, and those people tend to use
> emacs.
HA HA HA HA HA HA HA HA HA HA HA HA ....
Tee, hee heee.... snif!
Phew. Better now.
That was funny! Thanks! :-)
If I work on your project, I follow the coding and style standards you
specify.
Likewise if you work on my project you follow the established
standards.
Fortunately for you, I am fairly liberal on such matters.
I like to see 4 spaces for indentation. If you use tabs, that's what I
will see, and you're very likely to have your code reformatted by the
automated build process, when the standard copyright header is pasted
and missing javadoc tags are generated as warnings.
I like the open brace to start on the line of the control keyword. I
can deal with the open brace being on the next line, at the same level
of indentation as the control keyword. I don't quite understand the
motivation behind the GNU style, where the brace itself is treated as a
half-indent, but I can live with it on *your* project.
Any whitespace or other style that isn't happy to be reformatted
automatically is an error anyway.
I'd be very laissez-faire about it except for the fact that code
repositories are much easier to manage if everything is formatted before
it goes in, or as a compromise, as a step at release tags.
Ashesh..
the following are 2 FAQ following this thread. Thanks.
Addendum: 2006-05-15
Q: What do you mean by embedding tab position info into the source code?
How's that gonna be done?
A: Tech geekers may not realize it, but such embedding of meta info does
exist in many technologies, by various means, because of a need. For
example, Mac OS Classic's resource fork and Mac OS X's bundling system,
unix shell script's shebang (#!), emacs and Python's encoding
declaration “#-*- coding: utf-8 -*-”, Unicode's BOM, CVS's
change-log insertion, Mathematica's source code system the Notebook,
Microsoft Word's transparent meta data, as well as HTML and XML's
various declarations embedded in the file. Some of these systems are
good designs and some are hacks.
Somehow tech geekers have the sense that “source code” must be a
plain text file containing nothing else but the programming code. This
may be a defensible position, but as we can see in the above examples,
this idea is primitive and does not address the various needs. If the
tech geekers had thought these issues through, computing languages
and their source code might have developed into more powerful and flexible
integrated systems like the above standardized examples. For instance,
many commercial development systems actually already have such
meta-data embodied with the source code (e.g. Borland Delphi,
Metrowerks's CodeWarrior, Microsoft Visual Studio, Wolfram Research's
Mathematica). Some of these not only embody development-related info
such as debug points or linking files, but also allow programmers to
highlight code for visual purposes as in a word processor, or even
display it visually as typeset mathematics.
Q: Converting spaces to tabs is actually easy. I don't see how spaces
lose info.
A: Here is an illustration of how it is not possible to convert spaces
to tabs. Suppose you are writing in a language where the indentation is
part of the semantics, not just for appearance. Now, suppose you have
these two lines:
1234567890
  A
    B
The first line has a 2-space prefix and the second line has a 4-space prefix.
Now, if you convert this to tabs, how do you know whether that's 1 and 2 tabs,
or 2 and 4 tabs? In essence, there is no way to tell how many tabs n
spaces represent, where n is the smallest space prefix in the code, unless n
== 1.
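The ambiguity can be made concrete with a small Python sketch. It is only an illustration added to the thread; the function name `candidate_units` is made up. Given the leading-space counts found in a file, it lists every indent width the author might have meant.

```python
from functools import reduce
from math import gcd


def candidate_units(prefix_lengths):
    """Return every indent width (in spaces) that evenly divides all of the
    given leading-space counts, i.e. every width consistent with the file."""
    nonzero = [n for n in prefix_lengths if n > 0]
    if not nonzero:
        return []
    common = reduce(gcd, nonzero)
    return [unit for unit in range(1, common + 1) if common % unit == 0]


# Prefixes of 2 and 4 spaces, as in the two-line example above.
for unit in candidate_units([2, 4]):
    print("indent unit", unit, "-> tab counts", [n // unit for n in (2, 4)])
```

For prefixes of 2 and 4 the sketch reports both readings (2 and 4 tabs, or 1 and 2 tabs). Only when some line has a 1-space prefix is a single reading left.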
The above demonstrates the information loss in using spaces for
indentation in a theoretical way. There are also practical problems. In
practice, many languages allow string literals like this myName="i love
you", and strings can easily have a run of spaces. One cannot simply
run a blind find-and-replace operation to replace all spaces with tabs.
Also, many unix languages contain a construct called
“heredoc” as a means to embed a literal block of text. For example,
here's a PHP heredoc construct:
$novelText = <<<arbitraryCharsHereAsDelimiter
         (__)
         (oo)
  /-------\/
 / |     ||
*  ||----||
   ~~    ~~
arbitraryCharsHereAsDelimiter;
}
Regardless of its design as a language construct, the purpose of
“heredoc” is that it allows programmers to easily embed a text (a
large string), without worrying about the text containing sequences of
characters that may be meaningful to the language. If a language has
a heredoc construct, then it is basically impossible to convert from
spaces to tabs, as that will botch literal strings embedded in heredocs.
However, it is less of a problem to convert tabs to spaces, because
spaces appear in literal strings far more often than
literal tabs do.
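A short Python sketch of the string-literal problem follows. It is an added illustration, not part of the post; `blind_replace` and `leading_only` are made-up names, and the sample line is an assumed variant of the myName example with a run of four spaces inside the string.

```python
def blind_replace(line, width=4):
    """Replace every run of `width` spaces with a tab, anywhere in the line."""
    return line.replace(" " * width, "\t")


def leading_only(line, width=4):
    """Convert only the leading spaces to tabs; leftover spaces are kept."""
    rest = line.lstrip(" ")
    count = len(line) - len(rest)
    return "\t" * (count // width) + " " * (count % width) + rest


source = '    myName = "i    love you"'
print(repr(blind_replace(source)))   # the tab also lands inside the string literal
print(repr(leading_only(source)))    # the string literal is left alone
```

Restricting the conversion to leading whitespace protects one-line strings, but it does not help with heredocs, whose lines are literal text from their first column onward.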
Another practical issue is error recovery. Suppose one uses 4 spaces
for an indentation. Now, it is not uncommon to see lines with an irregular
number of leading spaces, such as 7 or 10, out of common sloppiness. Such
errors happen more often if spaces are used for indentation; the
essence is that tabs enforce a semantic association and make it impossible
to produce a half-indentation.
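Such stray prefixes are easy to detect mechanically. The following Python sketch is an added illustration with a made-up function name, `irregular_lines`; it assumes the intended indent width is known (4 here) and reports every line whose space prefix is not a multiple of it.

```python
def irregular_lines(source, width=4):
    """Yield (line_number, leading_spaces) for each non-blank line whose
    leading-space count is not a multiple of `width`."""
    for number, line in enumerate(source.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines carry no indentation information
        count = len(line) - len(line.lstrip(" "))
        if count % width:
            yield number, count


sample = "a\n    b\n       c\n          d\n"
print(list(irregular_lines(sample)))
```

Detection is the easy half. Deciding whether a 7-space line was meant to be at depth 1 or depth 2 still needs a human or a language-aware formatter.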
Q: Well, i just like spaces because they are most compatible.
A: Sure, crass simplicity is always more compatible. Suppose a unixer
says he doesn't like HTML because it is fraught with problems and
incompatibilities. He'd rather have plain text. And, indeed, a lot
of unixers seriously think that.
---------------------------
PS in the answer to the first question, i gave the following examples
of IDE/Language that actually embed formatting info in the source code:
Borland Delphi, Metrowerks's CodeWarrior, Microsoft Visual Studio,
Wolfram Research's Mathematica
actually, i know Mathematica does, but i'm not quite sure about the
other examples. So, my question is: does anyone know a language or
IDE that actually allows the coder to manually highlight parts of the
code and have this highlight stick with the file upon reopening, as in a
word processor?
Xah
···@xahlee.org
∑ http://xahlee.org/
Xah Lee wrote:
> Tabs versus Spaces in Source Code
> This post is archived at:
> http://xahlee.org/UnixResource_dir/writ/tabs_vs_spaces.html
Xah Lee wrote:
> the following are 2 FAQ following this thread. Thanks.
>
> Addendum: 2006-05-15
>
> Q: What do you mean by embedding tab position info into the source code?
> How's that gonna be done?
>
> A: Tech geekers may not realize it, but such embedding of meta info does
> exist in many technologies, by various means, because of a need. For
> example, Mac OS Classic's resource fork and Mac OS X's bundling system,
> unix shell script's shebang (#!), emacs and Python's encoding
> declaration “#-*- coding: utf-8 -*-”, Unicode's BOM, CVS's
> change-log insertion, Mathematica's source code system the Notebook,
> Microsoft Word's transparent meta data, as well as HTML and XML's
> various declarations embedded in the file. Some of these systems are
> good designs and some are hacks.
>
Vim's mode-lines do this too.
> Somehow tech geekers have the sense that “source code” must be a
> plain text file containing nothing else but the programming code. This
> may be a defensible position, but as we can see in the above examples,
> this idea is primitive and does not address the various needs. If the
> tech geekers had thought these issues through, computing languages
> and their source code might have developed into more powerful and flexible
> integrated systems like the above standardized examples.
The tech geekers have thought about it. Donald Knuth invented TeX, and
went on to invent the WEB literate programming system. You don't get any
geekier than that :)
> For instance,
> many commercial development systems actually already have such
> meta-data embodied with the source code (e.g. Borland Delphi,
> Metrowerks's CodeWarrior, Microsoft Visual Studio, Wolfram Research's
> Mathematica). Some of these not only embody development-related info
> such as debug points or linking files, but also allow programmers to
> highlight code for visual purposes as in a word processor, or even
> display it visually as typeset mathematics.
>
> Q: Converting spaces to tabs is actually easy. I don't see how spaces
> lose info.
>
> A: Here is an illustration of how it is not possible to convert spaces
> to tabs. Suppose you are writing in a language where the indentation is
> part of the semantics, not just for appearance. Now, suppose you have
> these two lines:
I'd say that such a language removes the choice of whether to use tabs
or spaces, and the discussion is over when you don't have a choice.
>
> 1234567890
>   A
>     B
>
> The first line has a 2-space prefix and the second line has a 4-space prefix.
> Now, if you convert this to tabs, how do you know whether that's 1 and 2 tabs,
> or 2 and 4 tabs? In essence, there is no way to tell how many tabs n
> spaces represent, where n is the smallest space prefix in the code, unless n
> == 1.
vim: tabstop=4
The argument for spaces over tabs says that you have to include some
metadata in order for the document to look right on other people's
computers if you use tabs. This example, plus my example mode-line for
vim, reinforces that idea IMO.
>
> The above demonstrates the information loss in using spaces for
> indentation in a theoretical way. There are also practical problems. In
> practice, many languages allow string literals like this myName="i love
> you", and strings can easily have a run of spaces. One cannot simply
> run a blind find-and-replace operation to replace all spaces with tabs.
> Also, many unix languages contain a construct called
> “heredoc” as a means to embed a literal block of text. For example,
> here's a PHP heredoc construct:
>
> $novelText = <<<arbitraryCharsHereAsDelimiter
>          (__)
>          (oo)
>   /-------\/
>  / |     ||
> *  ||----||
>    ~~    ~~
> arbitraryCharsHereAsDelimiter;
> }
>
Yes, there are lots of situations like this where you can't just
willy-nilly convert between tabs and spaces. But even this case shows
that, if you use consistent tab widths, the text has a chance of
surviving. I converted your little doggie to and from text with tab
sizes of eight, and he survived. (I did it with tabs set to four too,
and it worked.)
> Regardless of its design as a language construct, the purpose of
> “heredoc” is that it allows programmers to easily embed a text (a
> large string), without worrying about the text containing sequences of
> characters that may be meaningful to the language. If a language has
> a heredoc construct, then it is basically impossible to convert from
> spaces to tabs, as that will botch literal strings embedded in heredocs.
Yes it would. Upon printing, if the terminal tab width was set to eight,
but the text conversion was done with tabs at four, bye bye doggie.
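The mismatch is easy to reproduce. Here is a minimal Python sketch, added as an illustration: it takes the head and back lines of the ASCII animal as they would look after a conversion done at tab width four (nine leading spaces assumed to have become two tabs plus one space; the two-space prefix of the back line is too short to become a tab), then renders them at widths four and eight. The name `head_offset` is made up.

```python
# Head and back of the drawing after a hypothetical space-to-tab conversion
# performed with tabs at width 4.
converted = ["\t\t (__)", "  /-------\\/"]


def head_offset(tabsize):
    """Horizontal distance from the backslash of the back line to the opening
    parenthesis of the head, when tabs are displayed at `tabsize` columns."""
    head, back = (line.expandtabs(tabsize) for line in converted)
    return head.index("(") - back.index("\\")


for tabsize in (4, 8):
    print(tabsize, head_offset(tabsize))
```

At width four the head sits one column to the left of the backslash, as drawn; at width eight it drifts seven columns to the right, because only the tabbed line moves.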
> However, it is less of a problem to convert tabs to spaces, because
> spaces appear in literal strings far more often than
> literal tabs do.
>
> Another practical issue is error recovery. Suppose one uses 4 spaces
> for an indentation. Now, it is not uncommon to see lines with an irregular
> number of leading spaces, such as 7 or 10, out of common sloppiness. Such
> errors happen more often if spaces are used for indentation; the
> essence is that tabs enforce a semantic association and make it impossible
> to produce a half-indentation.
>
What I've learned is that, if I'm going to use tabs for indentation, I
have to be consistent.
> Q: Well, i just like spaces because they are most compatible.
>
> A: Sure, crass simplicity is always more compatible. Suppose a unixer
> says he doesn't like HTML because it is fraught with problems and
> incompatibilities. He'd rather have plain text. And, indeed, a lot
> of unixers seriously think that.
>
> ---------------------------
> PS in the answer to the first question, i gave the following examples
> of IDE/Language that actually embed formatting info in the source code:
> Borland Delphi, Metrowerks's CodeWarrior, Microsoft Visual Studio,
> Wolfram Research's Mathematica
>
Perl's POD and Java's javadoc do it too.
> actually, i know Mathematica does, but i'm not quite sure about the
> other examples. So, my question is: does anyone know a language or
> IDE that actually allows the coder to manually highlight parts of the
> code and have this highlight stick with the file upon reopening, as in a
> word processor?
>
> Xah
> ···@xahlee.org
> ∑ http://xahlee.org/
>
> Xah Lee wrote:
>> Tabs versus Spaces in Source Code
>> This post is archived at:
>> http://xahlee.org/UnixResource_dir/writ/tabs_vs_spaces.html
>
I'm slowly moving into the "spaces" camp. After reading your earlier
post on tabs vs. spaces and other people's responses, I began thinking
about why I like tabs so much, and there is only one answer--backspace.
If I use tabs, when I backspace I go back to the previous tab position,
which is what I want. With spaces, I have to hit the backspace key
several times to get back. That's it--one feature is the only reason I
like tabs, so I decided to investigate vim's features to see if vim
would let me backspace to the previous tab position with one keystroke.
'Softtabstop' (sts) is the feature. I would have never thought to look
for this feature without your post. Thanks again Xah.
Your posts are on topic, informative, engaging and necessary. Keep them
coming Xah. :)