How to print lines that only contain characters from a list in BASH? (grep)

Accepted answer
Score: 18
grep '^[eat]*$' dictionary.txt

Explanation:

^ = marker meaning beginning of line

$ = marker meaning end of line

[abc] = character class ("match any one of these characters")

* = multiplier for character class (zero or more repetitions)
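
As a minimal sketch of how these pieces combine, the snippet below runs the pattern on a small made-up word list (the file name sample_words.txt and its contents are assumptions for illustration). Because * also allows zero repetitions, an empty line matches too; with grep -E, + requires at least one character from the class.

```shell
# Build a small, hypothetical word list (one word per line, including one empty line).
workdir=$(mktemp -d)
printf '%s\n' one tea cheat '' eat ate > "$workdir/sample_words.txt"

# '*' allows zero repetitions, so the empty line is selected as well.
with_star=$(grep '^[eat]*$' "$workdir/sample_words.txt")

# With -E, '+' requires at least one character from the class.
with_plus=$(grep -E '^[eat]+$' "$workdir/sample_words.txt")

printf 'with *:\n%s\n' "$with_star"
printf 'with +:\n%s\n' "$with_plus"
rm -r "$workdir"
```

If empty lines in the dictionary are not a concern, the two variants select the same words.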

Score: 9

Unfortunately, I cannot comment, otherwise I'd add this to amphetamachine's answer. Anyway, with the updated condition of thousands of search characters, you may want to do the following:

grep -f patterns.txt dictionary.txt

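where patterns.txt contains your regexp, one pattern per line and without surrounding slashes (grep would treat them as literal characters):

^[eat]\+$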

Below is a sample session:

$ cat << EOF > dictionary.txt
> one
> two
> cat
> eat
> four
> tea
> five
> cheat
> EOF
$ cat << EOF > patterns.txt
> ^[eat]\+$
> EOF
$ grep -f patterns.txt dictionary.txt
eat
tea
$

This way you are not limited by the shell's command-line length (the "Argument list too long" error). Also, you can specify multiple patterns in the file, one per line:

$ cat patterns.txt
^[eat]\+$
^five$
$ grep -f patterns.txt dictionary.txt
eat
tea
five
$
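
For a large set of search characters, the pattern file can be generated rather than typed. The following is only a sketch under assumptions: chars.txt is a hypothetical file holding one allowed character per line, and none of the characters is ']', '^' or '-', which need special placement inside a bracket expression. It uses the POSIX interval \{1,\} in place of the GNU-specific \+.

```shell
workdir=$(mktemp -d)

# Hypothetical inputs: one allowed character per line, and a small dictionary.
printf '%s\n' e a t > "$workdir/chars.txt"
printf '%s\n' one two cat eat four tea five cheat > "$workdir/dictionary.txt"

# Join the characters into one bracket expression and write a single pattern line.
# Assumption: no ']', '^' or '-' among the characters.
allowed=$(tr -d '\n' < "$workdir/chars.txt")
printf '^[%s]\\{1,\\}$\n' "$allowed" > "$workdir/patterns.txt"

# The pattern is read from the file, so it never appears on the command line.
matches=$(grep -f "$workdir/patterns.txt" "$workdir/dictionary.txt")
printf '%s\n' "$matches"
rm -r "$workdir"
```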
Score: 6

Try it using awk:

awk '/^[eat]*$/ { print }' dictionary.txt

I found this to be at least an order of magnitude faster than grep for more than about 7 letters. However, I don't know if you will run into the same problem with thousands of letters, as I didn't test that many.

You can even search for multiple patterns at once (this is faster than searching each pattern one at a time, since the dictionary file will be read only once). Every pattern acts as an if statement:

awk '/^[eat]*$/ { print "[eat]: " $0 } /^[cat]*$/ { print "[cat]: " $0 }' dictionary.txt
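
As a variation on the multi-pattern command, the letter sets can be passed in as awk variables instead of being hard-coded. This is a sketch only: the variable names set1 and set2 and the sample dictionary are made up for illustration, and + is used so that empty lines are not reported.

```shell
workdir=$(mktemp -d)
printf '%s\n' one two cat eat four tea five cheat > "$workdir/dictionary.txt"

# Each rule builds its regex from a variable; the file is still read only once.
labelled=$(awk -v set1=eat -v set2=cat '
    $0 ~ ("^[" set1 "]+$") { print "[" set1 "]: " $0 }
    $0 ~ ("^[" set2 "]+$") { print "[" set2 "]: " $0 }
' "$workdir/dictionary.txt")

printf '%s\n' "$labelled"
rm -r "$workdir"
```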
Score: 5
sed -n '/a/'p words.txt

Use this for whichever letter you need to find. If you want to find more than one letter together, simply repeat the command for each letter.
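
Note that /a/ selects every line that contains the letter, not only lines made up entirely of the listed letters. A sketch of an anchored sed variant that is closer to the original question, run on a made-up words.txt, might look like this:

```shell
workdir=$(mktemp -d)
printf '%s\n' one two cat eat four tea five cheat > "$workdir/words.txt"

# Lines that merely contain an 'a' somewhere.
contains_a=$(sed -n '/a/p' "$workdir/words.txt")

# Lines consisting only of the letters e, a and t (at least one character).
only_eat=$(sed -n '/^[eat]\{1,\}$/p' "$workdir/words.txt")

printf 'contains a:\n%s\n' "$contains_a"
printf 'only e, a, t:\n%s\n' "$only_eat"
rm -r "$workdir"
```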

Grep also should not be used for more than the most simple/elementary of searches, IMHO. Although I normally hesitate to call any of the POSIX utilities obsolete, I do try to avoid grep. Its syntax is extremely inconsistent.

Studying this text file is also recommended: http://sed.sourceforge.net/sed1line.txt

Score: 1

If you want to include, e.g., umlauts in the pattern but do not want other accented characters to match, set LC_ALL="C" before running grep.

For example, this will give you only the candidate German words in a hypothetical dictionary.txt file:

LC_ALL="C" grep '^[a-zA-ZäÄöÖüÜß]*$' dictionary.txt
