How to print lines that only contain characters from a list in Bash?
grep '^[eat]*$' dictionary.txt
Explanation:
^ = marker meaning beginning of line
$ = marker meaning end of line
[abc] = character class ("match any one of these characters")
* = multiplier for character class (zero or more repetitions)
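One caveat worth checking: since * also allows zero repetitions, the pattern accepts empty lines as well. Here is a minimal sketch of the difference, assuming a throwaway file called sample.txt (a made-up name, not from the question) and switching to -E so that + can be used for "one or more":

```shell
# Work in a scratch directory so nothing existing is overwritten.
tmpdir=$(mktemp -d)

# sample.txt is a hypothetical file: three words and one empty line.
printf '%s\n' eat cheat '' tea > "$tmpdir/sample.txt"

# '*' means zero or more, so the empty line counts as a match.
star_count=$(grep -c '^[eat]*$' "$tmpdir/sample.txt")

# With -E (extended regexps), '+' means one or more, so the empty line is skipped.
plus_count=$(grep -c -E '^[eat]+$' "$tmpdir/sample.txt")

echo "star: $star_count, plus: $plus_count"
rm -rf "$tmpdir"
```

If your dictionary has no blank lines, both forms should print the same words.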
Unfortunately, I cannot comment, otherwise I'd add to amphetamachine's answer. Anyway, with the updated condition of thousands of search characters, you may want to do the following:
grep -f patterns.txt dictionary.txt
where patterns.txt contains your regexp (without any surrounding slashes):
^[eat]\+$
Below is a sample session:
$ cat << EOF > dictionary.txt
> one
> two
> cat
> eat
> four
> tea
> five
> cheat
> EOF
$ cat << EOF > patterns.txt
> ^[eat]\+$
> EOF
$ grep -f patterns.txt dictionary.txt
eat
tea
$
This way you are not limited by the shell ("Argument list too long"). Also, you can specify multiple patterns in the file:
$ cat patterns.txt
^[eat]\+$
^five$
$ grep -f patterns.txt dictionary.txt
eat
tea
five
$
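If the allowed characters themselves live in a file, the pattern file can be generated rather than typed. The following is only a sketch under assumptions: letters.txt is a hypothetical file with one allowed character per line, none of the characters is ], ^ or - (those need special placement inside a bracket expression), and it uses grep -E with + instead of \+ for portability:

```shell
# Work in a scratch directory so nothing existing is overwritten.
tmpdir=$(mktemp -d)
printf '%s\n' one two cat eat four tea five cheat > "$tmpdir/dictionary.txt"

# letters.txt is hypothetical: one allowed character per line.
printf '%s\n' e a t > "$tmpdir/letters.txt"

# Join the characters into one string by deleting the newlines.
chars=$(tr -d '\n' < "$tmpdir/letters.txt")

# Write a single extended-regexp pattern, here ^[eat]+$ , to the pattern file.
printf '^[%s]+$\n' "$chars" > "$tmpdir/patterns.txt"

# -f reads the pattern from the file, so it never appears on the command line.
matches=$(grep -E -f "$tmpdir/patterns.txt" "$tmpdir/dictionary.txt")
echo "$matches"
rm -rf "$tmpdir"
```

The only long string here is the value of chars, which is passed to the shell's printf rather than to grep. For a truly huge character list you would probably want to write the file without going through a shell variable at all.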
Try it using awk:
awk '/^[eat]*$/ { print }' dictionary.txt
I found this to be at least an order of magnitude faster than grep for more than about 7 letters. However, I don't know if you will run into the same problem with thousands of letters, as I didn't test that many.
You can even search for multiple patterns at once (this is faster than searching each pattern one at a time, since the dictionary file will be read only once). Every pattern acts as an if statement:
awk '/^[eat]*$/ { print "[eat]: " $0 } /^[cat]*$/ { print "[cat]: " $0 }' dictionary.txt
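To see what "acts as an if statement" means in practice, here is a small sketch using a made-up word list (the file contents are an assumption for illustration). Each pattern is tested independently against every line, so a word that fits both sets is printed twice:

```shell
# Work in a scratch directory so nothing existing is overwritten.
tmpdir=$(mktemp -d)

# Hypothetical word list; "at" fits both character sets.
printf '%s\n' one cat eat at tea > "$tmpdir/dictionary.txt"

# Both pattern-action pairs are evaluated for every input line.
result=$(awk '/^[eat]*$/ { print "[eat]: " $0 } /^[cat]*$/ { print "[cat]: " $0 }' "$tmpdir/dictionary.txt")

echo "$result"
rm -rf "$tmpdir"
```

With this input, "at" shows up once under each label, while "cat" appears only under [cat] and "eat" and "tea" only under [eat].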
sed -n '/a/'p words.txt
Use this for whichever letter you need to find. If you want to find more than one letter together, simply repeat the command.
Grep also should not be used for more than the most simple/elementary of searches, IMHO. Although I normally hesitate to call any of the POSIX utilities obsolete, I do try to avoid grep. Its syntax is extremely inconsistent.
Studying this text file is also recommended: http://sed.sourceforge.net/sed1line.txt
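Note that /a/ on its own selects every line that contains an a anywhere, which is not quite the same as "only characters from the list". As a sketch (the word list is made up for illustration), the anchored bracket expression from the grep answers can be used in sed too:

```shell
# Work in a scratch directory so nothing existing is overwritten.
tmpdir=$(mktemp -d)
printf '%s\n' one two cat eat four tea five cheat > "$tmpdir/words.txt"

# Lines that contain an "a" somewhere.
contains_a=$(sed -n '/a/p' "$tmpdir/words.txt")

# Lines made up only of e, a and t (at least one character long).
only_eat=$(sed -n '/^[eat][eat]*$/p' "$tmpdir/words.txt")

echo "contains a: $(echo $contains_a)"
echo "only e/a/t: $(echo $only_eat)"
rm -rf "$tmpdir"
```

The [eat][eat]* form is just a portable way of saying "one or more" in a basic regular expression.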
If you want to include e.g. umlauts in the pattern but not other accented characters, set LC_ALL="C" before running grep.
For example, this will give you only the candidate German words in a potential dictionary.txt file:
LC_ALL="C" grep '^[a-zA-ZäÄöÖüÜß]*$' dictionary.txt