A first stab at a Perl script to create Twitter friend/follow matrices

Geek alert: in case the title of this post isn’t a dead giveaway, I should tell you that unless you’re interested in APIs and badly-put-together bits of code, this probably isn’t for you.

I’ve recently found myself using a service provided by Damon Clinkscale called DoesFollow. It answers one simple question: “does Twitter user A follow Twitter user B?” Apart from a frill which lets you reverse the order of your question (“does Twitter user B follow Twitter user A?”), that’s all it does. You can even interrogate it from the address bar like this: http://doesfollow.com/barackobama/mediaczar

While I was thinking about how useful a service this is, I was suddenly struck by a moment of clarity. A lot of the research I’ve been doing could be simplified by something like this.

Quite often I want to find out whether MPs or congressmen or PR people follow each other on Twitter.

The way that I’ve been doing this until now is:

1. make a list of the people who I’m interested in researching
2. for each person on that list, grab the list of all the Twitter people whom they follow
3. process the list so that only relationships between the people on the list show up
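
The last of those steps can be sketched roughly as follows. This is only an illustration: the %follows data is made up, and internal_follows is a hypothetical helper name of mine, not part of any Twitter library. It assumes the followed lists have already been fetched.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data: each user mapped to everyone they follow.
# In practice these lists would have been fetched from the API beforehand.
my %follows = (
    mediaczar => [qw(kerrymg barackobama stephenfry)],
    kerrymg   => [qw(mediaczar timhoang)],
    timhoang  => [qw(stephenfry)],
);

# Keep only relationships where both ends are on the research list.
sub internal_follows {
    my ($follows, @usernames) = @_;
    my %on_list = map { $_ => 1 } @usernames;
    my %internal;
    for my $user (@usernames) {
        my @followed = @{ $follows->{$user} || [] };
        $internal{$user} = [ grep { $on_list{$_} } @followed ];
    }
    return %internal;
}

my %internal = internal_follows(\%follows, qw(kerrymg mediaczar timhoang));
print "$_ follows: @{ $internal{$_} }\n" for sort keys %internal;
```

Most of what gets fetched (barackobama and stephenfry here) is thrown away at the filtering stage, which is exactly the waste described next.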

If all I’m doing is checking to see who follows whom, then this is a horribly wasteful way of doing things. The Twitter API limits the number of calls one can make on it — so this wastage leads to things taking much longer.

If only I could cycle all the names I want to check through something like DoesFollow!

Well, it turns out that I can. And in theory it’s not much harder than using DoesFollow. The Twitter API (which is what DoesFollow uses, after all) has a method called friendships/exists. All we have to do is send Twitter the following request:

http://twitter.com/friendships/exists.xml?user_a=barackobama&user_b=mediaczar

and it will come back with the answer:

<friends>true</friends>
or
<friends>false</friends>
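
As a rough sketch, building that request and reading the answer might look like this. The helper names exists_url and parse_friends are my own inventions for illustration, and the reply format is assumed to be exactly the one shown above. No network call is made here; the sample replies are passed in as strings.

```perl
use strict;
use warnings;

# Build the request URL for one ordered pair of usernames.
sub exists_url {
    my ($user_a, $user_b) = @_;
    return 'http://twitter.com/friendships/exists.xml'
         . "?user_a=$user_a&user_b=$user_b";
}

# Map the XML reply to 1 (follows) or 0 (doesn't follow);
# return undef for anything that isn't a <friends> answer.
sub parse_friends {
    my ($xml) = @_;
    return unless defined $xml;
    if ($xml =~ m{<friends>\s*(true|false)\s*</friends>}) {
        return $1 eq 'true' ? 1 : 0;
    }
    return;
}

print exists_url('barackobama', 'mediaczar'), "\n";
print parse_friends('<friends>true</friends>'), "\n";   # prints 1
print parse_friends('<friends>false</friends>'), "\n";  # prints 0
```

Returning undef for an unexpected reply keeps an error page from being silently counted as "doesn't follow".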

Kludge-y Perl code

(This fabulous picture courtesy of There, I Fixed It)

So I tried to do this using Yahoo! Pipes, but there are too many nested loops. You need to do something like this:

get list of names

for each user_a (in list) {

for each user_b (in list) {

does friendship exist between user_a and user_b?

}

}

There’s no easy way to get Pipes to do this, as far as I can see (I’ll keep trying, but if someone else can help, I’d be v. grateful.)

So I’ve pulled together a badly-written Perl script to do the work for me.

The script

[code lang="perl"]
#!/usr/bin/perl
# checks the Twitter API to find the friendships between a list of usernames
# this should really use the NEW API call that would let us halve the number
# of API calls
# author: Mat Morrison
# date: Friday July 10, 2009
use strict;
use warnings;
use LWP::Simple;

# set up variables
# we're just using a whitespace delimited list for the moment
my @usernames = qw(kerrymg mediaczar timhoang titusbicknell);

# let's build the matrix with a hash of hashes...
# to begin with, we'll include diagonal values -
# that is -- we'll check to see whether @mediaczar follows @mediaczar
my %matrix;
foreach my $user_a (@usernames) {
    foreach my $user_b (@usernames) {
        # we should put in a conditional clause that will check for the diagonal values
        # and not bother checking whether someone is a friend of themselves...
        my $url = 'http://twitter.com/friendships/exists.xml?user_a='
            . $user_a
            . '&user_b='
            . $user_b;
        # get XML file from Twitter -- it's an astonishingly simple XML file that reads
        # <friends>true</friends>
        # or
        # <friends>false</friends>
        # so we don't need to do much with it...
        my $follows = get($url);
        # double quotes here, so that $url is interpolated into the message
        die "Can't get $url" unless defined $follows;
        # strip the tags - I'm using a generic "HTML stripping" regex
        $follows =~ s/<(.|\n)+?>//g;
        # we should probably convert "true" values to 1 and "false" values to zero or blank
        # now let's push data into the matrix
        $matrix{$user_a}{$user_b} = $follows;
    }
}

# spit out the data as a tab-delimited table
# print the top line first
# (we walk @usernames rather than the hash keys, which come back in no
# particular order, so that the columns line up with the header)
for my $user_b (@usernames) {
    print "\t$user_b";
}
# now print the values, one row per user_a
for my $user_a (@usernames) {
    print "\n$user_a";
    for my $user_b (@usernames) {
        print "\t$matrix{$user_a}{$user_b}";
    }
}
print "\n";
[/code]

Where next?

Most of my thinking is included above in the code comments. An obvious mistake I’m making is checking to see whether, say, @mediaczar follows @mediaczar. That wastes n API calls per search. But a more serious mistake is not using the new friendships/show method. Because it tells you whether user A follows user B and whether user B follows user A at the same time, it would save me lots of API calls. How many lots? Well, take a look at this.

This is what I’m doing at the moment — checking each and every cell in the matrix:

[Image: clumsy API call matrix]

This is what I’d be doing if I removed the diagonals:

[Image: matrix with diagonals removed]

And this is what I’d be doing if I used the newer API call:

[Image: matrix using the new API call]
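
In code terms, the third picture means visiting each unordered pair of names exactly once. Here is a minimal sketch of that loop. The unordered_pairs helper is a hypothetical name of mine, and the actual friendships/show request is left as a comment, since I haven't written that part.

```perl
use strict;
use warnings;

my @usernames = qw(kerrymg mediaczar timhoang titusbicknell);

# Return every unordered pair exactly once, skipping the diagonal.
sub unordered_pairs {
    my @names = @_;
    my @pairs;
    for my $i (0 .. $#names - 1) {
        for my $j ($i + 1 .. $#names) {
            push @pairs, [ $names[$i], $names[$j] ];
        }
    }
    return @pairs;
}

for my $pair (unordered_pairs(@usernames)) {
    my ($user_a, $user_b) = @$pair;
    # One friendships/show request per pair would go here;
    # its reply would fill in both matrix cells for the pair.
    print "$user_a <-> $user_b\n";
}
```

For the four names in the script this visits 6 pairs, against 16 cells in the full matrix.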

I had to look up the formula for working this out without colouring in little boxes. With a little tweaking (to prevent the diagonals from creeping back in), here it is:

((n-1)^2 + (n-1))/2 = n(n-1)/2

So, for a list of congresspeople (156 on Twitter as at Tuesday July 14, 2009), that’d be ((156-1)^2 + (156-1))/2 = 12,090 API calls. Which is still a lot and will require some careful throttling, but (literally) fewer than half of the 156^2 = 24,336 API calls that I’d need to run it as the script currently stands.
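
To check the arithmetic without colouring in little boxes, here is a small sketch that counts the calls for each of the three approaches. The sub names are mine, and the formula is used in its simplified form, n(n-1)/2.

```perl
use strict;
use warnings;

# Calls needed when every cell of the n x n matrix is checked separately.
sub calls_full_matrix { my ($n) = @_; return $n * $n }

# Calls needed once the diagonal is skipped.
sub calls_no_diagonal { my ($n) = @_; return $n * ($n - 1) }

# Calls needed when one request answers both directions for a pair.
sub calls_pairwise { my ($n) = @_; return $n * ($n - 1) / 2 }

my $n = 156;
printf "full matrix:       %d calls\n", calls_full_matrix($n);   # 24336
printf "without diagonal:  %d calls\n", calls_no_diagonal($n);   # 24180
printf "one call per pair: %d calls\n", calls_pairwise($n);      # 12090
```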

So – back to the drawing board for a while. I really can’t work out a programmatic way of doing this. Hmph.
