Hey guys, I've been reading up on PHP today and came up with a PHP script. I'd like to hear what you think of it and whether any improvements could be made. What the script does is quite simple: it lists links to almost all resources used in making a website (images, JS files, CSS files) as well as all hyperlinks on that site. For now you have to hardcode the address into the script by hand; I didn't make a form or a database for it. Here is my code, be judgmental:
<?php
$address = ""; //HARDCODE a link here pls...
if($address == "") exit("Please hardcode the address, thanks for understanding :)"); // nothing to fetch, so stop here
if(!preg_match("!^(https?)://!",$address)) $address = "http://" . $address; // if the address doesn't start with http or https, make it start that way
$address = preg_replace("!(/+)$!","",$address); //remove trailing slashes if any
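libxml_use_internal_errors(true); // real-world HTML is rarely valid, keep the parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTMLFile($address);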
$elements = $doc->getElementsByTagName('*'); // get all elements
$chars = preg_split('!/!', $address, -1, PREG_SPLIT_NO_EMPTY); //split the address
$file = count($chars) > 2 && strpos(end($chars), '.') !== false; // if the last path part (not the host name) contains a dot, then this is a .html, .htm etc. -> $file is true
foreach($elements as $element) // process each element
{
    foreach($element->attributes as $attribute) // for every attribute of the element
    {
        if(in_array($attribute->name, array('href','src'))) // only attributes that hold a link
        {
            $value = $attribute->value;
            // strpos instead of a regex for $address: its slashes and dots would break the pattern
            if(strpos($value,$address) !== 0 && !preg_match("!^#$!",$value) && !preg_match("!^(https?:)?//!",$value))
            {
                // getting into this "if" means $value needs to be rewritten :)
                if($file && preg_match("!^(\.\./)!",$value)) // if the address is a file and $value starts with ../
                {
                    $value = "../" . $value; // add one more ../ so the first one cancels the file name and the link still works
                }
                if(!preg_match("!^/!",$value)) // $value doesn't start with a slash: join address and value with one
                {
                    $value = $address . "/" . $value;
                }
                else // $value already starts with a slash: just combine them
                {
                    $value = $address . $value;
                }
            } // if $value starts with $address, is an external link or is a bare #, leave it as it is, then echo
            $safe = htmlspecialchars($value); // escape quotes and ampersands before printing into HTML
            echo "<a href=\"" . $safe . "\">" . $safe . "</a><br />";
        }
    }
}
?>
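
One thing I'm not sure about is the rewriting step: gluing the address and the value together probably breaks for things like mailto: links or root-relative paths on a page that sits in a subfolder. Below is a rough sketch of how that step might be pulled into a helper instead. Note that resolve_url is a name I made up, it is not part of the script above, and I'm assuming PHP 7+ and an absolute http(s) base address. It is simplified (it ignores user:pass@ in the base, drops the base's query string and skips some corner cases), so treat it as an idea rather than a finished solution.

```php
<?php
// Hypothetical helper (not part of the script above): resolve $rel against $base.
// Assumes $base is an absolute URL with scheme and host.
function resolve_url(string $base, string $rel): string
{
    // Empty reference, or one that already has a scheme (http:, mailto:, ...)
    if ($rel === '') return $base;
    if (preg_match('!^[a-z][a-z0-9+.-]*:!i', $rel)) return $rel;

    $parts = parse_url($base);
    $root  = $parts['scheme'] . '://' . $parts['host']
           . (isset($parts['port']) ? ':' . $parts['port'] : '');
    $basePath = $parts['path'] ?? '/';

    if (strpos($rel, '//') === 0) {
        return $parts['scheme'] . ':' . $rel;      // protocol-relative link
    }
    if ($rel[0] === '/') {
        $path = $rel;                               // root-relative link
    } elseif ($rel[0] === '#' || $rel[0] === '?') {
        $path = $basePath . $rel;                   // points into the same document
    } else {
        // relative to the directory of the base document
        $path = substr($basePath, 0, strrpos($basePath, '/') + 1) . $rel;
    }

    // Keep ?query and #fragment out of the ./ and ../ clean-up
    $cut    = strcspn($path, '?#');
    $suffix = substr($path, $cut);
    $out    = array();
    foreach (explode('/', substr($path, 0, $cut)) as $seg) {
        if ($seg === '..') {
            if (count($out) > 1) array_pop($out);   // never pop the leading empty segment
        } elseif ($seg !== '.') {
            $out[] = $seg;
        }
    }
    return $root . implode('/', $out) . $suffix;
}

echo resolve_url('http://example.com/dir/page.html', '../css/site.css'), "\n";
// prints http://example.com/css/site.css
```

With something like this, the whole "if" that rewrites $value could shrink to a single call, and the $file check would no longer be needed because the file name is cut off the base path before joining.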