Tuesday, April 22, 2008

Network Diversity Index Redux

Thanks to Darren Draper for taking a look at a suggestion I had made for network analysis in a previous post. Hopefully this is not a breach of "blog etiquette", but my response to his comment was rather long, so I entered it as a post instead.
Here was Darren's comment:

OK, so I used the Shannon index calculator to learn that my H1 = 0.9088. But what does that mean?

I'm assuming that an H value of 1 means that your population is not diverse - at least not diverse when considering the different kinds of populations assigned (which are arbitrary and subject to bias).

Here's a screenshot of what I've entered (as you can see, I mostly use Twitter to connect with the ed-tech community). http://tinyurl.com/6lupxe

Darren, a higher index means more diversity, and most biological communities will have a diversity index between 1.0 and 4.0. Your "community", with an index of 0.9088, indicates, on the surface, very little diversity. This would be what classical ecologists might call a "typal" community, like "grassland". In terms of your network, most of your information is coming from a single "species" called EdTech. The ecological analogue is an established, mid-latitude ecosystem with limiting resources, most of which pass through a large number of individuals belonging to very few species, or even a single one. The other species in this community, and there may be many, are represented by maybe only a single individual in the sample. You might say, "Well yeah, it's an EdTech community!" Low diversity in a network, to me, equates with focused but low-quality (shallow) information. Let's say you use Wordpress for your CMS, so you have a number of EdTech people using Wordpress in your network. If you added a few members of the Wordpress Codex community, you might also pick up information that may be of use to you.
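For anyone who wants to see the arithmetic behind the index, here is a minimal Python sketch of the Shannon calculation. It assumes the natural-log form of the index (the online calculator may use a different base), and the group names and counts are made up for illustration. Only the 257 EdTech figure comes from this post, so the result will not match Darren's 0.9088.

```python
import math

def shannon_index(counts):
    """Return H = -sum(p_i * ln(p_i)) over the non-empty groups."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)

# Hypothetical network: only the EdTech count (257) is taken from the post.
network = {"EdTech": 257, "Designer": 20, "Programmer": 15, "Non-profit": 8}

h = shannon_index(network.values())
max_h = math.log(len(network))  # H if all four groups were the same size

print(f"H = {h:.4f}, maximum possible with {len(network)} groups = {max_h:.4f}")
```

The further H falls below the maximum for the same number of groups, the more one "species" dominates the sample.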

Two observations:
One. If we consider this assessment to be correct, then, in conjunction with your discussion of Twitter Set Theory, you should be able to reduce the number of individuals in your network without reducing information content. Your EdTech species has a population of 257 competing for a resource: your time. If we assume the overlap in your EdTech set is such that only about 1 in 10 members supplies information the others do not, you could effectively reduce your EdTech population to 25-30 individuals, increasing efficiency without degrading information. You might say at this point, "I've come to rely on my connection to more than 30 individuals in this group. How can I eliminate any one?" This brings me to observation two.
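One way to picture "reducing individuals without reducing information" is as a set-cover problem. The sketch below is only an illustration, not a method from Darren's Twitter Set Theory post: it assumes each member can be described by a set of topics they reliably cover, and the member names and topics are hypothetical. It greedily keeps whoever adds the most topics not yet covered.

```python
def prune_network(topics_by_member):
    """Greedily pick members until every topic in the network is covered."""
    uncovered = set().union(*topics_by_member.values())
    kept = []
    while uncovered:
        # Keep the member who contributes the most still-uncovered topics.
        best = max(topics_by_member,
                   key=lambda m: len(topics_by_member[m] & uncovered))
        kept.append(best)
        uncovered -= topics_by_member[best]
    return kept

# Hypothetical members and topics, for illustration only.
members = {
    "alice": {"wordpress", "moodle", "podcasting"},
    "bob": {"wordpress", "moodle"},
    "carol": {"podcasting", "screencasts"},
    "dave": {"moodle"},
}

kept = prune_network(members)
print(kept)  # members whose topics together cover everything in the network
```

A greedy pass like this is a heuristic and will not always find the smallest possible set. It also ignores everything a topic list cannot capture, which is exactly the objection raised at the end of this observation.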

I believe your diversity is really higher than reported. I said "on the surface" earlier because I think the problem is in identifying a "species" in our analogy. If all the members in your EdTech population were giving you the same information, competition would have reduced their number before now (my guess is their number is growing). Case in point: three different species of Anole lizard were observed in a certain tree on a Caribbean island. In theory this shouldn't happen, because similar species can't occupy the same niche for very long without competition favoring one over the other two. Closer inspection revealed that each of them was occupying a very specific part of the tree and feeding on very specific prey in that area. Thus, they were not in competition with each other, and each was occupying a different role (niche) in the community. I believe closer scrutiny of your EdTech population will reveal that very distinct "species" exist within this group.
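The effect of lumping versus splitting "species" on the index can be sketched in a few lines of Python. The sub-groups below are hypothetical (only the EdTech total of 257 comes from this post), and the natural-log form of the Shannon index is assumed. The same people are simply labeled twice, once coarsely and once more finely.

```python
import math
from collections import Counter

def shannon_index(labels):
    """Shannon index H computed from one species label per individual."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())

# Each member is (coarse label, finer label). The split of the 257 EdTech
# members into three sub-groups is invented for illustration.
members = (
    [("EdTech", "EdTech: primary")] * 90
    + [("EdTech", "EdTech: secondary")] * 85
    + [("EdTech", "EdTech: tertiary")] * 82
    + [("Designer", "Designer")] * 20
    + [("Programmer", "Programmer")] * 15
)

coarse = shannon_index(c for c, _ in members)
fine = shannon_index(f for _, f in members)

print(f"lumped: H = {coarse:.4f}   split: H = {fine:.4f}")
```

Splitting a group into finer groups can never lower the index, so the reported value depends heavily on where the species lines are drawn.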

Biologists identify species using a key based on a dichotomy (a dichotomous key). An organism is assessed as either having a described character, which places it in one group, or lacking that character, which places it in another. A new character is then described, and the assessment continues in branching fashion until the "species" is identified (keyed out) by the set of accumulated characters. I've begun an attempt at this on a wiki, but this is a developing idea, much like the issue of "tagging". It will take time. One thing that might help is for people to give as much information in their profiles as they can comfortably give.
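A dichotomous key maps naturally onto a small binary tree. The sketch below is not the key from the wiki. The characters, profile fields, and "species" names are all hypothetical, chosen only to show the branching process of keying out a network member.

```python
from dataclasses import dataclass
from typing import Callable, Union

@dataclass
class Couplet:
    """One yes/no step in the key; each branch is another step or a species."""
    question: str
    has_character: Callable[[dict], bool]
    if_yes: Union["Couplet", str]
    if_no: Union["Couplet", str]

def key_out(profile, node):
    """Follow the branches until a species name (a plain string) is reached."""
    while isinstance(node, Couplet):
        node = node.if_yes if node.has_character(profile) else node.if_no
    return node

# Hypothetical key and profile fields, for illustration only.
KEY = Couplet(
    "Works in education?",
    lambda p: p.get("educator", False),
    if_yes=Couplet(
        "Teaches at primary or secondary level?",
        lambda p: p.get("level") in {"primary", "secondary"},
        if_yes="K-12 educator",
        if_no="Tertiary/vocational educator",
    ),
    if_no=Couplet(
        "Writes code?",
        lambda p: p.get("codes", False),
        if_yes="Programmer",
        if_no="Other",
    ),
)

print(key_out({"educator": True, "level": "primary"}, KEY))
```

The key is only as good as the profile data fed into it, which is why fuller profiles would help.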

Of course, most of this is hypothetical and may be based on untested assumptions, but, if networks are going to be an important part of how we use the technology, then I think some metrics need to be established for assessing them.

Thanks again, Darren, for the conversation.


Darren Draper said...

Thank you, Jeff, for the response. For the record, responding to comments with a new post is completely in line with blogging etiquette. It's your blog; you make the rules as to how you respond to your readers' comments.

Now, I think you're correct on two levels. First, my network does appear to lack diversity. Second, my network's diversity is actually greater than it appears.

The problem does lie in how I have classified the members of my network. While I agree that I could likely increase efficiency while not degrading information by reducing the number of edtech members that I follow (or by creating additional sub-classifications), the trick becomes determining exactly which folks to drop (and hoping that feelings aren't hurt in the process, a discussion that takes on multiple dimensions).

I like the ideas you've put forth here and will have to think more on the subject.

Sue Waters said...

Actually, Jeff, you've shown excellent blog etiquette, as Darren points out. As he highlights, it's your blog and your rules. However, it's always good blogging practice to respond in the comments to readers' comments on a post, and when your response is too long, writing a post instead is an excellent follow-up.

And to be extra sneaky (and earn bonus points) track Darren's comments on cocomment and you know what he is commenting on.

As a scientist myself I'm not sure we can apply these rules to our online networks -- perhaps there are alternative statistical measures that could be used?

Or perhaps the issue lies in identifying the different groups? Take for example your species "designer": each of those probably needs to be broken apart into its own unique species, then expanded, as they bring totally different expertise to the mixture. What about the educators? Primary, secondary, vocational and tertiary educators are all unique and offer totally different viewpoints.

Alternatively maybe Darren needs to look at the mix more closely in his network. I've got non-profits, educators, programmers, web designers, people from companies (Web 2.0, IT and phone companies), people from non-English speaking countries which all provide more diversity than just educators with ed tech backgrounds.

My girlfriend did exactly what you suggest with your Wordpress tip. She wanted to keep her Twitter network smaller, so she deliberately went through and hand-picked people from specific industry areas.

I don't believe Darren can reduce his network to 25-30 people and still obtain the same information because that is the magic with social networks. You never totally know where that amazing tip or suggestion is going to come from. Besides which 25-30 people is way too low; if you said 150-200 then maybe I would agree.

Jeff said...

@darren I, too, am going to think on this a while longer. Compared to most, my network is relatively small. I am simply going to track diversity over time and work on the species key.
@Sue Thanks for joining the conversation. You're correct that this analysis won't work on networks unless the data is valid.
Having telephone people in your network, you may know that this analysis came out of Bell Labs in the 1940s in an effort to deal with signal "noise". Even at the risk of losing serendipitous discovery, most of a large network is going to be noise. Selectively pruning a network can boost the signal-to-noise ratio.
But wait! I only say that to prod discussion. I too am fond of the social aspect of the network and would sorely miss dnorman's cycling exploits, utecht's gourmet screencasts and sujokat's "good night tweeties" (as I am rising) all for the sake of efficiency.
