Data cleaning without programming experience?

darrens Peon

#1

Guys,

I have a CSV file with 700+ records. Each record has about 20 fields.
In 3/4 of the fields I have text that contains some basic HTML like <br /><b><p> etc ...
Within this text I have just noticed there are lots of umlauts (special characters) ...
I need to clean this content and replace the umlauts with the correct numeric value ...
For example:
¬ = &#172;

I have no programming experience and use Notepad++ to edit this text.
I think there are just over 100 umlauts.

To save myself a really boring job, I need a way to 'find and replace' automatically.
Any suggestions?

Thanks

darrens, Jun 7, 2012 IP

spids Active Member

#2

Use Notepad++, it's great for editing files.

Go to Search > Find in Files.

There you can enter anything and replace it with anything. Keep in mind you can also replace text with a space.

spids, Jun 8, 2012 IP

Fortix Servers Peon

#3

Actually, this is quite easy to code.
You just need to write a loop that searches for a specific code and then replaces it.
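
A loop like that could look roughly like the Python sketch below. It is only an illustration, not a finished tool: the `REPLACEMENTS` table and the `replace_specials` name are made up for this example, and the table would have to be extended with whatever characters actually occur in the file.

```python
# Hypothetical lookup table: each special character and the numeric
# HTML reference that should replace it. Extend it as needed.
REPLACEMENTS = {
    "ä": "&#228;",
    "ö": "&#246;",
    "ü": "&#252;",
    "¬": "&#172;",
}


def replace_specials(text, replacements=REPLACEMENTS):
    """Loop over the table and swap each character for its numeric code."""
    for char, code in replacements.items():
        text = text.replace(char, code)
    return text


if __name__ == "__main__":
    # Plain ASCII such as the HTML tags is left untouched.
    print(replace_specials("<b>Köln</b> ¬"))
```

Characters that are missing from the table are left as they are, so it is worth checking the result for leftovers.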

Fortix Servers, Jun 10, 2012 IP

Rukbat Well-Known Member

#4

If there are 100 different special codes (an umlaut is the two dots over a vowel used in some languages), you'll have to do 100 global search and replace operations if you want to do it without programming. Notepad++ is fine to use for this. (You could also open the file in a web browser. If the page shows up right, highlight the whole page, then copy and paste it into Notepad, which ignores almost everything but pure ASCII.)
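
For anyone willing to try a few lines of code, the 100 separate replacements can probably be avoided. The Python sketch below converts every non-ASCII character to its numeric reference in one pass, using the standard `xmlcharrefreplace` error handler. The function names are invented for this example, and it assumes the CSV is saved as UTF-8; if the file uses another encoding (for example Windows-1252), the `encoding` argument has to be changed.

```python
def to_numeric_refs(text):
    """Turn every non-ASCII character into a numeric reference like &#252;."""
    # "xmlcharrefreplace" is a standard Python encoding error handler.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def clean_file(src_path, dst_path, encoding="utf-8"):
    """Read src_path, convert special characters, write the result to dst_path.

    The UTF-8 default is an assumption about how the file was saved.
    """
    with open(src_path, encoding=encoding, newline="") as src:
        content = src.read()
    with open(dst_path, "w", encoding="ascii", newline="") as dst:
        dst.write(to_numeric_refs(content))


if __name__ == "__main__":
    print(to_numeric_refs("<p>Müller ¬</p>"))
```

Because HTML tags, commas and quotes are plain ASCII, they pass through unchanged, so the CSV structure should stay intact. Writing to a separate output file keeps the original as a backup.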

Rukbat, Jun 16, 2012 IP

Ev0Lv Greenhorn

#5

You can do it in any text editor, I guess, using the search and replace feature.

Ev0Lv, Jun 17, 2012 IP
