Dr. Dobb's is part of the Informa Tech Division of Informa PLC

Tag Clouds: Usability and Math

Source Data

As input for a tag cloud, you need a dataset consisting of at least three columns:

  • Text (to display).
  • Weight (to determine the font size).
  • Identifier (something to support navigation).

The weight in the dataset often represents a frequency, such as the number of times a text is used as a search term or the number of units sold of a product. However, the weight need not be an integer value. You can also consider, say, election results consisting of political parties and their percentages, earthquakes and their intensity, or movie stars and their IQ. In fact, there is technically little difference between tag clouds, histograms, line graphs, and pie charts. (I wouldn't be surprised to find the tag cloud as just another type of standard chart in Excel 2010.)
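To make the three columns concrete, here is a minimal sketch of one row of source data. The class name TagItem and its member names are assumptions made for illustration. The only point it carries over from the text is that the weight is stored as a Double, so that percentages and intensities fit as well as counts.

```vbnet
' Hypothetical container for one row of tag cloud source data.
Public Class TagItem
    Public ReadOnly Text As String        ' what the cloud displays
    Public ReadOnly Weight As Double      ' Double, so non-integer weights fit
    Public ReadOnly Identifier As String  ' key used to build the navigation link

    Public Sub New(ByVal text As String, ByVal weight As Double, _
            ByVal identifier As String)
        Me.Text = text
        Me.Weight = weight
        Me.Identifier = identifier
    End Sub
End Class
```

An election result could then be stored as New TagItem("Green Party", 12.5, "party-7"), where all three values are made up.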

While constructing the source data for a tag cloud, you can impose restrictions on the raw data in the system in three ways:

  • As with other graphs, there is a limit to the density of information a tag cloud can carry. According to Wikipedia, tag clouds generally contain between 30 and 150 tags. Usability clearly sets an upper limit on the number of tags, and the page layout can further restrict the space available for the tag cloud. It is therefore necessary to allow for an imposed maximum length for the dataset.
  • Some texts may not be interesting to users and should be omitted from the tag cloud. This is the case for articles and other small words that are considered to be "noise" by search algorithms. If there are tags such as these in your data, you might want to filter the results.
  • Many tag clouds present information calculated over a period of time, such as the number of times that search terms have been used in the last 24 hours. Depending on the data, your function may contain extra parameters with which you restrict the aggregation of data to a (progressive) subset.
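The three restrictions above can be sketched in memory with LINQ (which requires .NET 3.5 or later). This is only an illustration under stated assumptions: the types TagEvent and TagRow, the module TagSource, the function BuildSource, and the short noise list are all hypothetical, and a real system would more likely apply the same restrictions in the database query.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical raw record: one use of a tag at a point in time.
Public Class TagEvent
    Public Id As Integer
    Public Text As String
    Public OccurredAt As DateTime
End Class

' Hypothetical aggregated row: the three columns a tag cloud needs.
Public Class TagRow
    Public Id As Integer
    Public Text As String
    Public Weight As Integer
End Class

Public Module TagSource
    ' Assumed noise list; a real system would load this from configuration.
    Private ReadOnly NoiseWords As String() = {"a", "an", "and", "of", "the"}

    Public Function BuildSource(ByVal events As IEnumerable(Of TagEvent), _
            ByVal maxCount As Integer, ByVal ignoreNoise As Boolean, _
            ByVal fromDate As DateTime, ByVal toDate As DateTime) _
            As List(Of TagRow)
        ' Time restriction: count only events inside [fromDate, toDate).
        ' Noise restriction: optionally drop words on the noise list.
        Dim counted = From e In events _
                      Where e.OccurredAt >= fromDate AndAlso e.OccurredAt < toDate _
                      Where Not (ignoreNoise AndAlso _
                                 NoiseWords.Contains(e.Text.ToLowerInvariant())) _
                      Group By e.Id, e.Text Into Weight = Count()

        ' Length restriction: keep the maxCount heaviest tags,
        ' then sort them alphabetically for display.
        Dim top = counted.OrderByDescending(Function(c) c.Weight).Take(maxCount)
        Return (From c In top Order By c.Text _
                Select New TagRow With {.Id = c.Id, .Text = c.Text, _
                                        .Weight = c.Weight}).ToList()
    End Function
End Module
```

Selecting the heaviest tags before sorting alphabetically matters: sorting first and truncating afterwards would keep the first maxCount names in the alphabet rather than the most important ones.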

Eventually, you will create one or more functions that resemble Listing One. Your architecture for data access is hopefully more sophisticated than this simple example. But if you separate the construction of source data from the remaining functional layers of the tag cloud, then you already have a better design than the average tag cloud example found on the Internet.

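' Requires Imports System.Data and Imports System.Data.SqlClient
' at the top of the file.
' Returns at most maxCount writers with the highest counts, sorted by name.
Public Function GetWriters(ByVal maxCount As Integer, _
        ByVal ignoreNoise As Boolean, ByVal fromDate As DateTime, _
        ByVal toDate As DateTime) As DataTable
    ' The inner query keeps the top maxCount rows by weight;
    ' the outer query sorts them alphabetically for display.
    Dim query As String = String.Format( _
        "SELECT * FROM (SELECT TOP {0} ID, Text, " & _
        "Count FROM Writers ORDER BY Count DESC) sub " & _
        "ORDER BY Text ASC", maxCount)
    'TODO: also filter on ignoreNoise, fromDate and toDate
    Dim table As New DataTable
    Using adapter As New SqlDataAdapter(query, _ConnectionString)
        ' Fill opens the connection, runs the query, and closes it again.
        adapter.Fill(table)
    End Using
    Return table
End Function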

Listing One
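The date filter left as a TODO in Listing One could look roughly like the sketch below. It rests on an assumption the article does not state: that raw hits are stored in a table WriterHits(WriterID, Text, HitDate), which is hypothetical, as are the class name WriterData and its constructor. The values are passed as SQL parameters rather than formatted into the query text.

```vbnet
Imports System
Imports System.Data
Imports System.Data.SqlClient

Public Class WriterData
    Private ReadOnly _ConnectionString As String

    Public Sub New(ByVal connectionString As String)
        _ConnectionString = connectionString
    End Sub

    Public Function GetWriters(ByVal maxCount As Integer, _
            ByVal fromDate As DateTime, ByVal toDate As DateTime) As DataTable
        ' WriterHits is a hypothetical table holding one row per hit.
        ' The inner query counts hits inside [fromDate, toDate) and keeps
        ' the top maxCount writers; the outer query sorts them by name.
        Dim query As String = _
            "SELECT * FROM (SELECT TOP (@maxCount) WriterID AS ID, [Text], " & _
            "COUNT(*) AS [Count] FROM WriterHits " & _
            "WHERE HitDate >= @fromDate AND HitDate < @toDate " & _
            "GROUP BY WriterID, [Text] ORDER BY COUNT(*) DESC) sub " & _
            "ORDER BY [Text] ASC"
        Dim table As New DataTable
        Using adapter As New SqlDataAdapter(query, _ConnectionString)
            With adapter.SelectCommand.Parameters
                .AddWithValue("@maxCount", maxCount)
                .AddWithValue("@fromDate", fromDate)
                .AddWithValue("@toDate", toDate)
            End With
            adapter.Fill(table)
        End Using
        Return table
    End Function
End Class
```

A noise filter could be added in the same way, for example with a NOT IN clause against a table of noise words, but that table would be another assumption about the schema.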
