$ ls
links  published  categories  alts  content

$ cat links
- Home
- Author

$ cat published


$ cat categories
- uncategorized

$ cat alts
- Gopher Mirror
- text/plain; width=40
- text/plain; width=72
- text/plain; width=120
- application/x-troff

$ cat content

                          Google Sheets Oddity
-------------------------------------------------------------------------

I like to use Google Sheets for some quick chart/graph creation.  This
often means that I end up creating scripts that produce CSV content and
then paste it into a sheet and finally use that sheet to create the
chart I want.

Today, I wanted to make a chart comparing the number of lines in each
numbered file in a directory.  I ended up doing this with some simple
bash/awk scripting, but I ran into a confusing problem that makes sense
in hindsight.

The TL;DR is that when using this workflow, it's important to mark the
column as Plain Text if you're getting spurious commas thrown in.

In my case, the directory I was working with looked like:

--------8<--------------------------------------------------------------
$ ls rc_*
rc_1000 rc_1008 rc_1016 rc_1024 rc_1064 rc_1072 rc_1080 rc_1088
rc_1001 rc_1009 rc_1017 rc_1025 rc_1065 rc_1073 rc_1081 rc_1089
rc_1002 rc_1010 rc_1018 rc_1026 rc_1066 rc_1074 rc_1082 rc_1090
rc_1003 rc_1011 rc_1019 rc_1027 rc_1067 rc_1075 rc_1083 rc_1091
-------->8--------------------------------------------------------------

Getting the word counts of each of these is easy.

--------8<--------------------------------------------------------------
$ wc -l rc_*
    1 rc_1000
    1 rc_1001
    1 rc_1002
    1 rc_1003
    1 rc_1008
    1 rc_1009
    1 rc_1010
    1 rc_1011
    1 rc_1016
    1 rc_1017
[... snip ... ]
   38 rc_1075
    1 rc_1080
  161 rc_1081
    1 rc_1082
   17 rc_1083
   12 rc_1088
  192 rc_1089
   51 rc_1090
   26 rc_1091
  728 total
-------->8--------------------------------------------------------------

And converting this to CSV is pretty easy too, with some Awk magic.

--------8<--------------------------------------------------------------
$ !! | awk 'BEGIN { OFS="," } !/total/ { gsub("rc_", ""); print int($2), $1 }'
1000,1
1001,1
1002,1
1003,1
1008,1
1009,1
1010,1
1011,1
1016,1
1017,1
[... snip ...]
1075,38
1080,1
1081,161
1082,1
1083,17
1088,12
1089,192
1090,51
1091,26
-------->8--------------------------------------------------------------

Then it's just a matter of copy-pasting this into Google Sheets, which
is easier with the xsel command.

--------8<--------------------------------------------------------------
$ !! | xsel -i --clipboard
-------->8--------------------------------------------------------------

Now, the crux of the issue: when I paste this in, I normally expect to
see a dialog that asks if I want to split by comma.  This time, I had
accidentally set the column to be a number (Select Column A, Format >
Number > Number), and when I paste in the data from above, Sheets "help-
fully" inserted extra commas for me.  This is mostly caused by the
"1081,161" line, and anything with a comma that looks like it's separat-
ing the thousand from the hundreds.

The fix is non-obvious if you don't realize the Format as Number prob-
lem, but it is: select the column again and change the formatting to
Plain Text.  Now when you paste, Sheets won't insert extra commas.