amuck-landowner

Shell script IP extractor

drmike

100% Tier-1 Gogent
So I have a pile of files, and within those are various IPs (among all sorts of other data).

The goal is to extract those IPs from all the various files into a single new IP-only file.

Does anyone have some shell script magic to accomplish such a thing?
 

Aldryic C'boas

The Pony
As far as quick and dirty goes, grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' will work fine.

-06:27:12- Wren:~ :: aldryic % cat iptest
1.2.3.4 no this does not match 3.4.5.6
4.5.6.7 8.9.10.15 ohgoditburns 43.25.11.5
51.13.45.31 and this last one is invalid 301.405.551.987

-06:27:14- Wren:~ :: aldryic % grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' iptest
1.2.3.4
3.4.5.6
4.5.6.7
8.9.10.15
43.25.11.5
51.13.45.31
301.405.551.987
Now, the downside is that the simple regex will catch 'invalid' IPs too, like that last one. But unless you're expecting invalid IPs to be in your files, that shouldn't be a problem.

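EDIT: Instead of grep [...] filename, you will likely want to do grep -Eho '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' * > report.txt to run the regex through all files in that directory and dump the results into a .txt file for you. The -h stops grep from prefixing each match with its file name, which it otherwise does when given more than one file.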
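If the invalid matches do turn out to matter, one possible tightening is to keep the same simple regex and then drop any match with a field above 255. This is only a sketch: the function name extract_valid_ips is made up for illustration, and it assumes a grep that supports -h and -o (GNU and BSD grep both do).

```shell
# Hypothetical helper: print only dotted quads whose four fields are each 0-255.
# Reads the files given as arguments, or stdin when there are none.
extract_valid_ips() {
    # -h suppresses file name prefixes, -o prints each match on its own line
    grep -Eho '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' "$@" |
        # Split each match on dots and keep it only if every field is <= 255
        awk -F. '$1 <= 255 && $2 <= 255 && $3 <= 255 && $4 <= 255'
}
```

Run as extract_valid_ips iptest, this should keep the first six addresses from the sample file and drop 301.405.551.987.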
 

5n1p

New Member
With Python:

Code:
import re
# Print every dotted-quad match found in the file as a list
with open('filenamehere.txt') as f:
    print(re.findall(r'[0-9]+(?:\.[0-9]+){3}', f.read()))
 

drmike

100% Tier-1 Gogent
They are mainly logs... Some files are self-created from other data. IPs should all be valid, although I could see some logging timestamps getting confused as IPs... although those have only 2 fields divided by a period.
 

drmike

100% Tier-1 Gogent
This --->

 grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+'

That works great and fast... Now to deal with the duplicates in the output.

Thank you!
 

drmike

100% Tier-1 Gogent
Perfected!

clear && date && cat *.log.* | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | sort | uniq > end.txt && date

Some false entries in there due to matching, but I will clean those up with some SQL logic after import.
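For anyone reusing this, the sort | uniq pair can probably be folded into a single sort -u, and sorting on each dotted field numerically keeps 10.0.0.2 ahead of 10.0.0.10. A rough sketch, where unique_sorted_ips is a hypothetical name and the -h flag is assumed to be available in your grep:

```shell
# Hypothetical helper: de-duplicate dotted quads and order them numerically by field.
# Reads the files given as arguments, or stdin when there are none.
unique_sorted_ips() {
    grep -Eho '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' "$@" |
        # -t. splits on dots, each -kN,Nn compares one field as a number,
        # and -u keeps a single line per distinct set of four fields
        sort -t. -k1,1n -k2,2n -k3,3n -k4,4n -u
}
```

Something like unique_sorted_ips *.log.* > end.txt would then stand in for the cat, grep, sort and uniq chain. Note that -u compares the numeric keys, so 01.2.3.4 and 1.2.3.4 would count as the same entry.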
 