
P1 - Topic and Team Members

http://students.washington.edu/pfluga/INFO424/

Team
Ahn
Adam
Paul
Zach

Our general area of interest for this project is the visualization of traffic patterns on websites, specifically visualization for users within the organization that owns the site. We hope to facilitate exploration of large amounts of otherwise very hard-to-understand data in order to gain useful insight into the behavior of a site's visitors.

A number of tools already exist in this space, including Mint, Webalizer, Analog, and AWStats. We believe, however, that there is significant room for improvement in these products. In particular, when they use visualization at all, they use it very sparingly and in relatively simple applications (traffic plotted against time, for example). Perhaps even more significantly, these applications do not make use of dynamic interfaces and user interaction in any real way. We believe that by leveraging user interaction we will be able to provide a much richer set of tools for exploring data than the limited statistical measures provided in these applications.

Because of the time constraints on this project, we feel it would be best to complete a thorough design phase rather than rush through both design and implementation. To this end, we will create a paper implementation of the solution and focus more heavily on user testing and research.

The obvious source for our data would be web server traffic logs. Nearly all sites already collect and archive this information, it provides a great deal of depth for analysis, and it is already used by the other tools of this type on the market. One potential problem, however, is that to be useful, large amounts of data must be transformed from lists of file requests into aggregate data of some kind. While not a particularly difficult problem, it is nevertheless very time-consuming. Because we are trying to avoid the actual coding details of the project, and the scale of this data makes it infeasible to perform these transformations manually, we may find it useful to create test data in an already processed form for our application to use, leaving the code that translates the raw data into this form as a fairly trivial implementation detail.
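To give a sense of what this transformation might involve, here is a minimal sketch in Python. It assumes the server writes logs in the Common Log Format; our actual logs may differ. The names `LOG_LINE`, `aggregate_hits`, and `sample_log` are hypothetical, and the sample lines are made up for illustration. It is not part of the planned deliverable.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout (Common Log Format):
# host ident user [timestamp] "METHOD path protocol" status bytes
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

def aggregate_hits(lines):
    """Count successful (2xx) requests per (day, path)."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        if not match.group("status").startswith("2"):
            continue  # ignore errors and redirects
        when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        hits[(when.date().isoformat(), match.group("path"))] += 1
    return hits

# Made-up sample lines, for illustration only.
sample_log = [
    '192.0.2.1 - - [10/Oct/2006:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.2 - - [10/Oct/2006:14:02:11 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.1 - - [10/Oct/2006:14:05:40 -0700] "GET /missing.html HTTP/1.0" 404 209',
    '192.0.2.3 - - [11/Oct/2006:09:12:03 -0700] "GET /about.html HTTP/1.0" 200 1500',
    'not a log line',
]

daily_hits = aggregate_hits(sample_log)
for (day, path), count in sorted(daily_hits.items()):
    print(day, path, count)
```

A table of per-day, per-page counts like this is the kind of already processed test data we could write by hand for the paper implementation, without running any code against real logs.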

We believe that through careful design we can give site developers, marketing analysts, and other members of an organization greater access to information about how their sites are being used.