PHP script to clean up table data
You just found a great source of data through your favorite search engine. The data is formatted as a table, and you'd love to grab a copy to use in your application or display on your own page. You pull up the HTML source, and the table markup makes your stomach uneasy. It renders nicely in the browser, but the code behind it is a mess. How will you get this great data out?
There are many ways to clean up or scrape this sort of data. I am going to discuss a quick and dirty way to get it into PHP arrays; from there you can use it in your app or generate better HTML for the table. The idea is to copy each column into its own list and have PHP combine the lists into rows, using the same index into every list to pick the values that belong to one row. In Firefox you can quickly copy a single column by holding down the Control (Ctrl) key and dragging a selection down the column. Once the cells you want are highlighted, copy them and paste them into your script, following the example.
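<?php
// Each heredoc holds one column pasted from the browser, one cell per line.
$cols = array(
<<<ENDTEXT
75
103
60
13
ENDTEXT
,
<<<ENDTEXT
132
132
132
15
ENDTEXT
,
<<<ENDTEXT
12/11/2007
12/11/2007
12/11/2007
03/05/2010
ENDTEXT
,
<<<ENDTEXT
18,018
21,585
8,844
3,025
ENDTEXT
,
);

// Split the first column into lines; its length is the expected row count.
$col1 = explode("\n", array_shift($cols));
$col1_count = count($col1);

// Split the remaining columns and check that each has the same number of lines.
$arrs = array();
foreach ($cols as $c) {
    $n = explode("\n", $c);
    if (count($n) != $col1_count) {
        die('column length mismatch');
    }
    array_push($arrs, $n);
}

// Build one array per row by taking the value at the same index from every column.
$data = array();
foreach ($col1 as $i => $v) {
    $n = array(trim($v));
    foreach ($arrs as $a) {
        array_push($n, trim($a[$i]));
    }
    array_push($data, $n);
}
var_dump($data);

// Print the rows as a plain HTML table, escaping each cell.
echo '<table border="1">';
foreach ($data as $tr) {
    echo '<tr>';
    foreach ($tr as $td) {
        echo '<td>' . htmlspecialchars($td) . '</td>';
    }
    echo '</tr>';
}
echo '</table>';
?>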
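The column-to-row step can also be written more compactly. The following is only a sketch of an alternative, not a replacement for the script above: it assumes PHP 7 or later, and the helper names `split_column` and `columns_to_rows` are made up for illustration. It relies on the built-in behavior of `array_map` with a `null` callback, which zips several arrays together element by element.

```php
<?php
// Hypothetical helper: split one pasted column into trimmed cell values.
function split_column(string $text): array
{
    // Accept Unix (\n), Windows (\r\n) and old Mac (\r) line endings.
    $lines = preg_split('/\r\n|\r|\n/', trim($text));
    return array_map('trim', $lines);
}

// Hypothetical helper: turn a list of pasted columns into a list of rows.
function columns_to_rows(array $columns): array
{
    $lists = array_map('split_column', $columns);

    // Every column must contain the same number of cells.
    $lengths = array_unique(array_map('count', $lists));
    if (count($lengths) > 1) {
        throw new InvalidArgumentException('column length mismatch');
    }

    if (count($lists) < 2) {
        // array_map(null, $one) would return the list unchanged,
        // so wrap each cell in its own single-element row instead.
        $only = $lists ? $lists[0] : array();
        return array_map(function ($v) { return array($v); }, $only);
    }

    // With a null callback, array_map zips the arrays index by index.
    return array_map(null, ...$lists);
}

$rows = columns_to_rows(array("75\n103\n60\n13", "132\n132\n132\n15"));
print_r($rows);
```

Throwing an exception instead of calling `die()` is a design choice made here so the caller can decide how to report a bad paste; the length check matters because `array_map` would otherwise silently pad short columns with `null`.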
Make sure you copied the same number of cells for every column; otherwise the script stops with a "column length mismatch" error. When it finishes, $data holds one array per table row, each containing that row's cell values in column order. You can take this one step further and generate a new, more concise table with much cleaner markup.
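As a rough illustration of what cleaner markup could look like, here is a minimal sketch that renders rows shaped like `$data` into a table with a header row. The function name `render_table` and the header labels are assumptions of mine; the original table's real column headings are not given, so substitute your own.

```php
<?php
// Hypothetical helper: render rows as a small, tidy HTML table.
function render_table(array $headers, array $rows): string
{
    // Escape every value so stray markup in the copied data cannot break the page.
    $esc = function ($v) {
        return htmlspecialchars((string) $v, ENT_QUOTES, 'UTF-8');
    };

    $html = "<table>\n  <thead>\n    <tr>";
    foreach ($headers as $h) {
        $html .= '<th>' . $esc($h) . '</th>';
    }
    $html .= "</tr>\n  </thead>\n  <tbody>\n";

    foreach ($rows as $row) {
        $html .= '    <tr>';
        foreach ($row as $cell) {
            $html .= '<td>' . $esc($cell) . '</td>';
        }
        $html .= "</tr>\n";
    }

    return $html . "  </tbody>\n</table>\n";
}

// Placeholder header labels; replace them with the real column names.
$headers = array('Column 1', 'Column 2', 'Column 3', 'Column 4');
$rows = array(
    array('75', '132', '12/11/2007', '18,018'),
    array('13', '15', '03/05/2010', '3,025'),
);
echo render_table($headers, $rows);
```

Returning the markup as a string rather than echoing it directly makes it easy to write the result to a file or embed it in a larger template.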
It's as easy as that.