Python Programming Challenge: Create a Tag Cloud
October 25
We'll offer 3 months of free OpenSolaris VPS service (up to 512 MB RAM) to whoever provides the most efficient algorithm (in the form of a class) for a tag cloud generator. Input is a dictionary. The keys are the tag names; the value of each key is the list of "members" of that tag, from which you generate the tag count.
Please be sure to check with us before starting the challenge.
Update Oct 26th:
The original post was far too terse. Here are a few more details.
Input is a dictionary like:
tagdict = {"things": ["pen", "paper"], "animals": ["frog", "snake", "lion"]}
HTML Markup, example:
#tagcloud{
text-align:center;
}
#tagcloud .tiny {
font-family: Arial, Helvetica, sans-serif;
font-size: 12px;
color: #6699CC;
text-decoration:none;
}
#tagcloud .med {
font-family: Arial, Helvetica, sans-serif;
font-size: 13px;
color:#FF9900;
text-decoration:none;
}
#tagcloud .big {
font-family: Arial, Helvetica, sans-serif;
font-size: 16px;
color:#990000;
text-decoration:none;
}
#tagcloud .verybig {
font-family: Arial, Helvetica, sans-serif;
font-size: 18px;
color:#99CC00;
text-decoration:none;
}
#tagcloud a:hover{
color:#000;
}
<div id="tagcloud">...</div>
Instantiating the class:
t = tagCloud(tagdict)
html = t.get_html
The output ("html" value) should be something like:
<a href="" class="tiny" id="things">things</a>
<a href="" class="med" id="animals">animals</a>
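One way the class described above could be sketched is shown here. The post does not say how tag counts map to the four CSS classes, so the linear bucketing between the smallest and largest count is purely an assumption, as are the attribute names `counts`, `lo`, and `hi` and the helper `size_class`. `get_html` is written as a property only so that `t.get_html` works without parentheses, as in the instantiation example.

```python
from html import escape


class tagCloud:
    """Sketch of a tag cloud generator; the bucketing rule is an assumption."""

    # CSS class names from the stylesheet in the post, smallest to largest.
    SIZES = ("tiny", "med", "big", "verybig")

    def __init__(self, tagdict):
        # The tag count is the number of members listed under each tag.
        self.counts = {tag: len(members) for tag, members in tagdict.items()}
        # Compute the range once so each lookup below is O(1).
        self.lo = min(self.counts.values(), default=0)
        self.hi = max(self.counts.values(), default=0)

    def size_class(self, count):
        # Assumed rule: spread counts linearly over the four classes.
        if self.hi == self.lo:
            return self.SIZES[0]
        index = (count - self.lo) * (len(self.SIZES) - 1) // (self.hi - self.lo)
        return self.SIZES[index]

    @property
    def get_html(self):
        # One anchor per tag, in the dictionary's own order.
        lines = []
        for tag, count in self.counts.items():
            name = escape(tag, quote=True)
            lines.append('<a href="" class="%s" id="%s">%s</a>'
                         % (self.size_class(count), name, name))
        return "\n".join(lines)


tagdict = {"things": ["pen", "paper"], "animals": ["frog", "snake", "lion"]}
t = tagCloud(tagdict)
html = t.get_html
print(html)
```

With only two tags, this rule puts "things" in "tiny" and "animals" in "verybig" rather than "med", since the largest count always lands in the top bucket. A different threshold scheme would be needed to reproduce the sample output exactly. The whole thing runs in one pass over the dictionary plus one pass to emit the markup.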
Any chance you could put up a file with some test data? Also, how much data are we talking about?
Does this need to be able to scale out rather than up?
Does it have to be real-time, or is this offline processing?
What kind of output are you expecting?
Have you tried any readily available solutions (assuming they exist)? If so, which, and why aren't they good enough?
Also, "most efficient", is that runtime, memory, or both?
Do you have a reference implementation that we're supposed to beat? If so, does it have a test suite we can use to verify our own algorithms?
And when's the deadline?
/Patrik
Just updated the blog post with more details. "Most efficient" means both memory use and runtime.
We do not have a reference implementation. We have seen some solutions; we do not want ones that just play with the font sizes through the % parameter (see HTML markup). This is offline processing.
Deadline is November 7th (since this isn't a difficult challenge, we've kept it short).
Of course, there shouldn't be bugs, so please do test. :)