README file for WRS2ARFF.PS1

QUICKSTART
	Start up the script, type SCAN and press return.  Load the resultant ARFF file into WEKA (or whatever).
END OF QUICKSTART

What language is WRS2ARFF written in?
	It is a Microsoft PowerShell script.
	
What OSes will it run on?
	Windows 7, Vista, XP (add-on), and possibly other OSes where Mono (.NET clone) runs.

What does it do?
	It watches the WiFi network interface and makes a note of all the WiFi base stations it can see (identified by BSSID, the Basic Service Set Identifier) and the power at which each one is seen (RSSI, the Received Signal Strength Indicator).  It then writes this information to a plain text file with the extension .ARFF.
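
	As a rough illustration of what one scan yields, here is a minimal sketch in Python (not the script's own PowerShell).  The COMPUTER option further down names NETSH WLAN as the data source; the sample text below only imitates the general shape of `netsh wlan show networks mode=bssid` output, and its values are made up.  netsh prints signal as a percentage; how WRS2ARFF turns that into the value it stores is not described in this README, so the sketch simply keeps the percentage.  The function name parse_readings is hypothetical.

```python
import re

# Made-up sample in the general shape of
# `netsh wlan show networks mode=bssid` output (assumption, not captured output).
SAMPLE = """\
SSID 1 : office-net
    Network type            : Infrastructure
    Authentication          : WPA2-Personal
    Encryption              : CCMP
    BSSID 1                 : 00:11:22:33:44:55
         Signal             : 80%
         Radio type         : 802.11n
         Channel            : 6
    BSSID 2                 : 00:11:22:33:44:66
         Signal             : 42%
         Radio type         : 802.11g
         Channel            : 11
"""

BSSID_RE = re.compile(r"^\s*BSSID\s+\d+\s*:\s*([0-9A-Fa-f:]{17})\s*$")
SIGNAL_RE = re.compile(r"^\s*Signal\s*:\s*(\d+)%\s*$")


def parse_readings(text):
    """Return a list of (bssid, signal_percent) pairs, one per visible BSSID."""
    readings = []
    current = None  # BSSID whose Signal line has not been seen yet
    for line in text.splitlines():
        match = BSSID_RE.match(line)
        if match:
            current = match.group(1).lower()
            continue
        match = SIGNAL_RE.match(line)
        if match and current is not None:
            readings.append((current, int(match.group(1))))
            current = None
    return readings


if __name__ == "__main__":
    for bssid, signal in parse_readings(SAMPLE):
        print(bssid, signal)
```

	Each pair returned here corresponds to one reading, that is, one visible BSSID in one scan.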
	
So what use is an ARFF file?
	As well as the BSSID and RSSI information, the ARFF file records the name of the computer running the script and a descriptor of the physical area, or room, in which the computer is situated.  The BSSID, RSSI, COMPUTER and ROOM information in the ARFF file can be read into a data mining program such as WEKA or RapidMiner.  These programs use machine learning (ML) algorithms to find patterns in data, and these patterns can be used to make associations.  For example, suppose it is known that a computer can see Access Points (APs) with BSSIDs of AP1, AP2 and AP3 when it is in room R1, and Access Points AP4, AP5 and AP6 when it is in room R2.  If another computer can see AP1, AP2 and AP3, then it is probably located in R1.  If we extend this idea to use not only whether an AP, or group of APs, can be seen but also at what power level, or RSSI, it is seen, then given some clever ML algorithms we can have a good stab at working out the location, or room, of a computer from the BSSIDs and associated RSSI values it is registering.
	
	However, Radio Frequency Fingerprinting, as this is known, typically requires training and a large body of knowledge about the expected BSSIDs and RSSI values in each and every room; this forms what is known as a Radio Map.  This is where the ARFF file comes in: it is basically a collection of BSSID and RSSI data pertaining to rooms, typically many hundreds or thousands of readings.
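
	To make the room-guessing idea concrete, here is a minimal Python sketch of a nearest-neighbour lookup against a tiny radio map.  This is only one simple way to do it, not what WEKA or RapidMiner do internally.  The radio map entries, the RSSI numbers, the MISSING placeholder value and the function names are all made up for illustration.

```python
# Hypothetical radio map: each entry is (room, {bssid: rssi}).
# All values are invented for this sketch.
RADIO_MAP = [
    ("R1", {"AP1": -40, "AP2": -55, "AP3": -70}),
    ("R1", {"AP1": -42, "AP2": -57, "AP3": -68}),
    ("R2", {"AP4": -45, "AP5": -50, "AP6": -72}),
    ("R2", {"AP3": -85, "AP4": -47, "AP5": -52, "AP6": -70}),
]

MISSING = -100  # assumed RSSI for an AP that was not seen at all


def distance(a, b):
    """Squared Euclidean distance between two {bssid: rssi} fingerprints."""
    total = 0
    for bssid in set(a) | set(b):
        diff = a.get(bssid, MISSING) - b.get(bssid, MISSING)
        total += diff * diff
    return total


def guess_room(fingerprint, radio_map=RADIO_MAP):
    """Return the room of the closest stored fingerprint (1-nearest-neighbour)."""
    room, _ = min(radio_map, key=lambda entry: distance(fingerprint, entry[1]))
    return room


if __name__ == "__main__":
    # A computer seeing AP1, AP2 and AP3 at roughly the R1 levels.
    print(guess_room({"AP1": -41, "AP2": -60, "AP3": -66}))
```

	With only a handful of entries this is a toy; the more readings per room in the radio map, the better a real classifier can separate neighbouring rooms.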
	
How do I collect a hundred thousand BSSID/RSSI readings?
	I'm glad you asked!  Well (finally) that is where WRS2ARFF comes in.  Run it on a WiFi-enabled laptop and it lets you collect hundreds of thousands of readings easily, as an ARFF file.
	
How do I run it?
	Just start up PowerShell and run it from there or, as long as the file has a PS1 extension, right-click on it and choose Run with PowerShell.  Once running, it is a text-based, menu-driven console program.
	
How do I use it?
	At its simplest, just start up the script, type SCAN, then press return.  This will take a number of measurements and write them to an ARFF file.  The number of measurements it takes can be altered by typing NUMBER, then a number, say 1000, then pressing return.
	
	
What else can I change?
	NAME m		-	Set the room name.
					E.g. NAME helpdesk-office
					Note: no spaces are allowed in the name m.
    
	NUMBER n	-	Where n is the number of scans to perform in this room.
					E.g. NUMBER 1000
					One scan may generate multiple readings, one reading per visible BSSID.
					Each reading takes up one line of the ARFF file.
    
	DELAY t		-	Where t is the time in seconds between scans in this room.
					E.g. DELAY 5
					Would be 5 seconds per scan.  Setting t to 0 makes it go as fast as it can.
    
	HEADER h	-	Write the ARFF header every h scans.
					E.g. HEADER 500
					Sometimes useful if you can't wait for it to finish and want to copy the ARFF file for premature analysis.  Without this, the copied ARFF file may contain data with elements not listed in the ARFF header, and it would then fail to load into WEKA, for example.  The ARFF header is updated after every h scans and after the final scan.
    
	COMPUTER c	-	Gather readings from the target computer c.
					WRS2ARFF will try to gather information from remote machine c.  Machine c will need to be running an OS that reports WiFi RSSI and BSSID data via NETSH WLAN, e.g. Windows 7.  PowerShell remote execution must also be enabled on the remote machine.
	
	SCAN		-	Perform the scan of this room.
					Performs n scans with a t seconds delay between each scan, rewriting the ARFF header every h scans, to a file called f at path p.
					If CLEAR hasn't been entered since the last SCAN command, the new data will be appended; otherwise the ARFF file will first be cleared.
    
	CLEAR		-	Clear the existing ARFF file upon next scan.  
					Normally each SCAN appends data to any existing ARFF file; CLEAR makes the next SCAN start afresh.
    
	PATH p		-	Set the output path p of the ARFF file.
    
	FILE f		-	Set the output filename f of the ARFF file (don't include .ARFF) 
	
	QUIT		-	Quits the script.
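
	How NUMBER, DELAY and HEADER work together during a SCAN can be sketched roughly as follows, in Python rather than the script's own PowerShell.  This is a sketch under assumptions, not the script's actual logic: the attribute names (bssid, rssi, computer, room), the relation name, the function names and the scan callback are all hypothetical, and values are assumed to contain no spaces or commas.  It shows one plausible reason the header needs refreshing: an ARFF nominal attribute must list every value that appears in the data, so a BSSID first seen late in the run has to be added to the header.

```python
import time


def arff_text(rows, relation="wifi"):
    """Build a complete ARFF document from (bssid, rssi, computer, room) rows."""
    def nominal(index):
        # A nominal attribute lists every allowed value in the header.
        values = sorted({row[index] for row in rows})
        return "{" + ",".join(values) + "}"

    lines = [
        "@relation " + relation,
        "@attribute bssid " + nominal(0),
        "@attribute rssi numeric",
        "@attribute computer " + nominal(2),
        "@attribute room " + nominal(3),
        "@data",
    ]
    lines += ["%s,%d,%s,%s" % row for row in rows]
    return "\n".join(lines) + "\n"


def run_scans(scan, computer, room, number, delay, header_every, path):
    """Run `number` scans; `scan()` returns (bssid, rssi) pairs for one scan."""
    rows = []
    for i in range(1, number + 1):
        new_rows = [(bssid, rssi, computer, room) for bssid, rssi in scan()]
        rows.extend(new_rows)
        if i == 1 or i % header_every == 0 or i == number:
            # Rewrite the whole file so the header covers every value so far.
            with open(path, "w") as out:
                out.write(arff_text(rows))
        else:
            # Between header refreshes, just append the new data lines.
            with open(path, "a") as out:
                out.writelines("%s,%d,%s,%s\n" % row for row in new_rows)
        if delay and i < number:
            time.sleep(delay)
    return rows
```

	In this sketch, a copy of the file taken between header refreshes may contain a BSSID that the header does not list yet, which is the situation the HEADER option is described as guarding against.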
	
	
	Please feel free to email me with any questions, suggestions or improvements.  :-)
	
	Kind Regards,
	Shaun Dunmall
	sdunmall@gmail.com
	
	
	
	ARFF file
	http://www.cs.waikato.ac.nz/ml/weka/arff.html
	
	WEKA
	http://www.cs.waikato.ac.nz/ml/weka/
	
	RAPID MINER
	http://rapid-i.com/content/view/181/190/
	
	POWERSHELL
	http://en.wikipedia.org/wiki/Windows_PowerShell
	
	DATA MINING
	http://en.wikipedia.org/wiki/Data_mining
	
	
	
	
Source: README.txt, updated 2012-04-01