Count Popular Words [Tag Cloud-ish]

timallard Well-Known Member

Messages:: 1,634

Likes Received:: 62

Best Answers:: 0

Trophy Points:: 158

#1

Hello,

I am trying to create some trending out of my mysql data.
I have a table, lets call it "table" i also have a column, lets call it column.
The column has many rows (thousands) lets call them rows.

I want to find out which words are used the most. eg. "Morning", "Computers", "Tim".

The only thing is, this data is already in my database, I don't want to keep track of the count like a search query would by inserting or updating a number,.. at least i don't think it would work out since the data already exists ina column.

I want to search a specific column and find out the most popular words so i can print them out..

any ideas?...thanks!

timallard, Sep 28, 2009 IP

timallard Well-Known Member

Messages:: 1,634

Likes Received:: 62

Best Answers:: 0

Trophy Points:: 158

#2

any idea on this? thanks!

timallard, Sep 29, 2009 IP

plog Peon

Messages:: 298

Likes Received:: 11

Best Answers:: 1

Trophy Points:: 0

#3

I think the solution to this will be achieved programatically not via the database. Here's the general way I would solve this:

~set up 2 arrays: $master_words, $master_counts
~execute a simple query to extract all the data from the column you want to analyze
~loop through each row of query results and explode the column into an array of words $tmp_words
~loop through each $tmp_word element to see if that particular word exists
~~if it does -increment its corresponding $master_counts element
~~if not add it to $master_words and set its corresponding $master_counts element to 1
~when all query results have been cataloged loop through the $master_counts array to find out which element has the most
~use that index to retrieve the corresponding word from $master_words

plog, Sep 29, 2009 IP

ohteddy Member

Messages:: 128

Likes Received:: 2

Best Answers:: 0

Trophy Points:: 28

#4

If I understand correctly you have a table that looks like:

Tags
--------
apple
orange
peach
orange
orange

And the expected output would look like:

Count | Tag
---------+-------
3 | orange
1 | apple
1 | peach

The SQL would be (untested)

SELECT COUNT(tag) as count, tag FROM tags GROUP BY tag ORDER BY count;

If I've mis-understood the problem, perhaps you could give me an example.

ohteddy, Sep 29, 2009 IP

timallard Well-Known Member

Messages:: 1,634

Likes Received:: 62

Best Answers:: 0

Trophy Points:: 158

#5

Thanks guys I will try some things out from these answers.

- Ohteddy: curently the colomn I need to find similar words are sentences. e,g.

1. This is the first sentence i want to compare
2. how about we try another sentence?

out of this - the word sentence would be marked as having a higher weight or count.

hmmm

timallard, Oct 1, 2009 IP

ohteddy Member

Messages:: 128

Likes Received:: 2

Best Answers:: 0

Trophy Points:: 28

#6

I see, this is quite a bit different from what I thought.

The easiest solution I can think of is to build a histogram.
I'm not sure how I would do it in SQL with out defining a
custom function or using a temporary table. You will
probably want to write it in a language you're comfortable
with. Also, depending on the size of your data set, you
will may want to cache the results to disk.

If you're not familiar with histograms checkout: http://en.wikipedia.org/wiki/Histogram
If you have any problems implementing this let me know and I'll see
if I can help.

Last edited: Oct 1, 2009

ohteddy, Oct 1, 2009 IP

Log in or Sign up

Count Popular Words [Tag Cloud-ish]

timallard Well-Known Member

timallard Well-Known Member

plog Peon

ohteddy Member

timallard Well-Known Member

ohteddy Member

Useful Searches