
One line awk to find all words with all vowels


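Here is a one-line awk command that finds all the words in a given file that contain all five vowels. The $1 after the first awk is the input file, passed as the first argument when the command is saved in a script.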

awk '{c=split($0, s); for(n=1; n<=c; ++n) print s[n] }' $1 | awk 'match ($0, /[e]/)' |awk 'match ($0, /[i]/)' | awk 'match ($0, /[a]/)'| awk 'match ($0, /[o]/)' |awk 'match ($0, /[u]/)'

Suppose you have a file with the following contents:

education
reduction
allusions
documentation

The above command prints:
education
documentation

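In the above command, the first awk splits each line into words and prints one word per line. Each of the following awk stages passes a word on only if it contains one particular vowel, so only the words containing all of a, e, i, o and u reach the output. The match is case-sensitive, so only lowercase vowels count.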
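The five chained filters can also be folded into a single awk process. Below is a minimal sketch, assuming the same whitespace-separated input and lowercase-only matching as the original pipeline; the function name all_vowel_words is hypothetical and only there for illustration.

```shell
# Hypothetical helper (not from the original post): print every
# whitespace-separated word that contains each of a, e, i, o and u.
# Matching is lowercase-only, like the original pipeline.
all_vowel_words() {
  awk '{
    # NF is the number of fields (words) on the current line
    for (n = 1; n <= NF; ++n)
      if ($n ~ /a/ && $n ~ /e/ && $n ~ /i/ && $n ~ /o/ && $n ~ /u/)
        print $n
  }' "$@"
}

# Reads stdin when no file names are given
printf 'education\nreduction\nallusions\ndocumentation\n' | all_vowel_words
```

With the sample input above this prints education and documentation, the same as the original pipeline, while starting one awk instead of six.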

Here is a command that recursively searches all the .txt files for words containing a, e, i, o and u.

find -name "*.txt" |xargs awk '{c=split($0, s); for(n=1; n<=c; ++n) print s[n] }'| awk 'match ($0, /[e]/)' |awk 'match ($0, /[i]/)' | awk 'match ($0, /[a]/)'| awk 'match ($0, /[o]/)' |awk 'match ($0, /[u]/)'|sort|uniq -i|awk '!match ($0, /[_,\-\.\/\|\@\:\]\[\)\(\*\&\=\*\#\!\{\}\"\?]/)'

A quick observation about the above command: the last regular expression (awk '!match ($0, /[_,\-\.\/\|\@\:\]\[\)\(\*\&\=\*\#\!\{\}\"\?]/)') removes all the words that contain special characters. If we eliminate these words earlier in the pipeline, the vowel filters, sort and uniq all have less input to process, which should improve performance.

find -name "*.txt" |xargs awk '{c=split($0, s); for(n=1; n<=c; ++n) print s[n] }'| awk '!match ($0, /[_,\-\.\/\|\@\:\]\[\)\(\*\&\=\*\#\!\{\}\"\?]/)'| awk 'match ($0, /[e]/)' |awk 'match ($0, /[i]/)' | awk 'match ($0, /[a]/)'| awk 'match ($0, /[o]/)' |awk 'match ($0, /[u]/)'|sort|uniq -i
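Taking the same idea one step further, the rejection test and the vowel tests can sit in one awk process, with the rejection test first. This is only a sketch under a few assumptions that differ from the command above: the function name vowel_words_in_tree is hypothetical, a word is rejected if it contains anything other than ASCII letters (stricter than the explicit punctuation list), find -exec ... + stands in for xargs, and sort -fu stands in for sort | uniq -i.

```shell
# Hypothetical helper (not from the original post): list the distinct
# all-vowel words found in *.txt files under a directory (default: .).
vowel_words_in_tree() {
  find "${1:-.}" -name '*.txt' -exec awk '{
    for (n = 1; n <= NF; ++n) {
      w = $n
      # Assumption: anything that is not a plain ASCII letter disqualifies
      # the word; this test runs before the five vowel tests.
      if (w ~ /[^A-Za-z]/) continue
      if (w ~ /a/ && w ~ /e/ && w ~ /i/ && w ~ /o/ && w ~ /u/)
        print w
    }
  }' {} + | sort -fu   # -f folds case, -u keeps one line per distinct word
}
```

Calling vowel_words_in_tree with a directory name prints the sorted, de-duplicated words. Whether this is actually faster than the pipeline above has not been measured here, so it is worth timing both on your own input.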

I ran the above two commands with the "time" command, and the results are as follows.

Without optimization:

real 0m0.824s
user 0m1.508s
sys 0m0.076s

With optimization (the last command):

real 0m0.784s
user 0m1.868s
sys 0m0.064s

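Even on a medium-sized input directory, the wall-clock (real) time drops slightly, from 0.824s to 0.784s. Note that the user CPU time actually went up, from 1.508s to 1.868s, so the gain is small and a single run is not conclusive.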
