Have you ever wanted to make sure that a name used as an identifier, for example as a function name or as a table name, is valid?
Here is a solution that will support:
- the table-name conventions of PostgreSQL, MySQL and probably others too
- the function naming of PHP
- the variable naming of PHP
- probably a lot more out there
Here is the PostgreSQL specification regarding table names:
PostgreSQL: Documentation: Manuals: PostgreSQL 8.4: Lexical Structure
Take a look at the source code for further information.
Below are three functions that validate identifiers and report the mistakes in invalid ones.
Some credits go to mki on this page:
non-Latin letters in bundle (content type) / field names | drupal.org
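The character check in the class below does not look at the length of a name. PostgreSQL truncates identifiers to 63 bytes by default (NAMEDATALEN - 1), so for table names a length limit can be added on top. A minimal sketch, assuming a hypothetical helper name `isValidTableName` and a default limit of 63 bytes (neither is part of the class below):

```php
<?php
/**
 * Hypothetical helper: identifier pattern check plus a byte-length limit.
 * The default of 63 bytes assumes an unmodified PostgreSQL build
 * (NAMEDATALEN - 1); pass another limit for other systems.
 */
function isValidTableName(string $name, int $maxBytes = 63): bool
{
    // Same character rule as in this post: letter, underscore or high byte
    // first, then letters, digits, underscores or high bytes.
    // \z rejects a trailing newline.
    if (preg_match('#\A[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*\z#', $name) !== 1) {
        return false;
    }
    // strlen() counts bytes, and the PostgreSQL limit is a byte limit.
    return strlen($name) <= $maxBytes;
}

var_dump(isValidTableName("users_2024"));         // bool(true)
var_dump(isValidTableName(str_repeat("a", 64)));  // bool(false)
```

For MySQL, whose table names may be up to 64 characters long, a different limit can be passed as the second argument.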
PHP-Code:
/**
* Misc functions
*
* @author Herbert Walde alias Atomic
*/
class Functions{
/**
* Returns whether an identifier is valid
* A valid identifier can be used as:
* - tablename
* - php function name
* - php variable name
* - and much more
* A valid identifier starts with a letter or underscore, followed by any number of letters, digits, or underscores.
* Whether the letters are uppercase or lowercase doesn't matter.
* Letters can also be letters with diacritical marks and non-Latin letters!
* Here is an example:
* <code>
* ////////////////////////////////////////////////////////////////////////////////////////////////
* // EXAMPLE //
* ////////////////////////////////////////////////////////////////////////////////////////////////
*
* $identifiers = array();
* //valid identifiers:
* $identifiers[]="äsdsd";
* $identifiers[]="àaàsas";
* $identifiers[]="_àaàsas";
*
* //invalid identifiers:
* $identifiers[]="'àaàsas"; //' is not allowed at all
* $identifiers[]="\"àaàsas"; //" is not allowed at all
* $identifiers[]="àaà'sas"; //' is not allowed at all
* $identifiers[]="àaà\"sas"; //" is not allowed at all
* $identifiers[]=" àaàsas"; //whitespaces at the beginning are not allowed
* $identifiers[]="àaàsas "; //whitespaces at the end are not allowed
* $identifiers[]="\$àaàsas"; //$ is not allowed at all
* $identifiers[]="\\àaàsas"; //\ is not allowed at all
* $identifiers[]="àaà sas"; //whitespaces in between are not allowed
* $identifiers[]=" à aà sa&s"; //Multiple mistakes
*
*
* echo "<html>\n";
* echo "<head>\n";
* echo "\t<meta http-equiv=\"content-type\" content=\"text/html; charset=UTF-8\">\n";
* echo "</head>\n";
* echo "<body>\n";
* echo "<ul>\n";
*
* $new_chars = strlen("<font color=\"red\"><u><strong>"."</strong></u></font>");
* foreach($identifiers as $indentifier){
* if(Functions::isValidIdentifier($indentifier)){
* echo "\t<li><font color=\"green\">Identifier \"".$indentifier."\" is a valid! :-)</font></li>\n";
* } else {
* $invalid_chars = Functions::getInvalidIdentifierChars($indentifier);
* $invalid_offsets = Functions::getInvalidIdentifierOffsets($indentifier);
* $add_to_offset=0;
* foreach($invalid_offsets as $offset){
* $offset += $add_to_offset;
*
* $indentifier = substr($indentifier, 0, $offset)."<font color=\"red\"><u><strong>".$indentifier[$offset]
* ."</strong></u></font>".substr($indentifier, $offset+1);
*
* $add_to_offset += $new_chars;
* }
* $invalid_chars = str_replace(" ","WHITSPACE",implode(",",$invalid_chars));
* echo "\t<li>Identifier \"".$indentifier."\" is a invalid! :-( Invalid chars: ".$invalid_chars."</li>\n";
* }
* }
*
* echo "</ul>\n";
* echo "</body>\n";
* echo "</html>\n";
* </code>
* This code produces the following output:
* <pre>
* <html>
* <head>
* <meta http-equiv="content-type" content="text/html; charset=UTF-8">
* </head>
* <body>
* <ul>
* <li><font color="green">Identifier "äsdsd" is a valid! :-)</font></li>
* <li><font color="green">Identifier "àaàsas" is a valid! :-)</font></li>
* <li><font color="green">Identifier "_àaàsas" is a valid! :-)</font></li>
* <li>Identifier "<font color="red"><u><strong>'</strong></u></font>àaàsas" is a invalid! :-( Invalid chars: '</li>
* <li>Identifier "<font color="red"><u><strong>"</strong></u></font>àaàsas" is a invalid! :-( Invalid chars: "</li>
* <li>Identifier "àaà<font color="red"><u><strong>'</strong></u></font>sas" is a invalid! :-( Invalid chars: '</li>
* <li>Identifier "àaà<font color="red"><u><strong>"</strong></u></font>sas" is a invalid! :-( Invalid chars: "</li>
* <li>Identifier "<font color="red"><u><strong> </strong></u></font>àaàsas" is a invalid! :-( Invalid chars: WHITSPACE</li>
* <li>Identifier "àaàsas<font color="red"><u><strong> </strong></u></font>" is a invalid! :-( Invalid chars: WHITSPACE</li>
* <li>Identifier "<font color="red"><u><strong>$</strong></u></font>àaàsas" is a invalid! :-( Invalid chars: $</li>
* <li>Identifier "<font color="red"><u><strong>\</strong></u></font>àaàsas" is a invalid! :-( Invalid chars: \</li>
* <li>Identifier "àaà<font color="red"><u><strong> </strong></u></font>sas" is a invalid! :-( Invalid chars: WHITSPACE</li>
* <li>Identifier "<font color="red"><u><strong> </strong></u></font>à<font color="red"><u><strong> </strong></u></font>aà<font color="red"><u><strong> </strong></u></font>sa<font color="red"><u><strong>&</strong></u></font>s" is a invalid! :-( Invalid chars: WHITSPACE,WHITSPACE,WHITSPACE,&</li>
* </ul>
* </body>
* </html>
* </pre>
*
* @see Functions::getInvalidIdentifierChars()
* @see Functions::getInvalidIdentifierOffsets()
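* @param string $identifier the identifier to check
* @return bool true if the identifier is valid, false otherwise
* @access public
*/
public static function isValidIdentifier($identifier){
// The D modifier stops "$" from matching before a trailing newline
return preg_match("#^[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*$#D", $identifier) === 1;
}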
/**
* Returns the faulty chars of an identifier (see isValidIdentifier() for further information)
*
* @see Functions::isValidIdentifier()
* @param string $identifier the faulty identifier
* @return array indexed array containing the invalid chars
*/
public static function getInvalidIdentifierChars($identifier){
$invalid_chars=array();
$matches=array();
// A digit is allowed inside an identifier, but not as its first char
if(preg_match("#^[0-9]#", $identifier, $matches)){
$invalid_chars[]=$matches[0];
}
// Collect every char that is not allowed anywhere in an identifier
preg_match_all("#[^a-zA-Z0-9_\x7f-\xff]#", $identifier, $matches);
$invalid_chars=array_merge($invalid_chars, $matches[0]);
if(count($invalid_chars)==0){
trigger_error("Did not find a faulty char!");
}
return $invalid_chars;
}
/**
* Returns the positions of the faulty chars of an identifier (see isValidIdentifier() for further information)
*
* @see Functions::isValidIdentifier()
* @param string $identifier the faulty identifier
* @return array indexed array containing the byte offsets of the invalid chars
*/
public static function getInvalidIdentifierOffsets($identifier){
$invalid_offsets=array();
$matches=array();
// A digit is allowed inside an identifier, but not as its first char
if(preg_match("#^[0-9]#", $identifier)){
$invalid_offsets[]=0;
}
// PREG_OFFSET_CAPTURE yields array(char, byte offset) pairs for each match
preg_match_all("#[^a-zA-Z0-9_\x7f-\xff]#", $identifier, $matches, PREG_OFFSET_CAPTURE);
foreach ($matches[0] as $match){
$invalid_offsets[]=$match[1];
}
if(count($invalid_offsets)==0){
trigger_error("Did not find a faulty char!");
}
return $invalid_offsets;
}
}
////////////////////////////////////////////////////////////////////////////////////////////////
// EXAMPLE //
////////////////////////////////////////////////////////////////////////////////////////////////
$identifiers = array();
//valid identifiers:
$identifiers[]="äsdsd";
$identifiers[]="àaàsas";
$identifiers[]="_àaàsas";
//invalid identifiers:
$identifiers[]="'àaàsas"; //' is not allowed at all
$identifiers[]="\"àaàsas"; //" is not allowed at all
$identifiers[]="àaà'sas"; //' is not allowed at all
$identifiers[]="àaà\"sas"; //" is not allowed at all
$identifiers[]=" àaàsas"; //whitespaces at the beginning are not allowed
$identifiers[]="àaàsas "; //whitespaces at the end are not allowed
$identifiers[]="\$àaàsas"; //$ is not allowed at all
$identifiers[]="\\àaàsas"; //\ is not allowed at all
$identifiers[]="àaà sas"; //whitespaces in between are not allowed
$identifiers[]=" à aà sa&s"; //Multiple mistakes
echo "<html>\n";
echo "<head>\n";
echo "\t<meta http-equiv=\"content-type\" content=\"text/html; charset=UTF-8\">\n";
echo "</head>\n";
echo "<body>\n";
echo "<ul>\n";
$new_chars = strlen("<font color=\"red\"><u><strong>"."</strong></u></font>");
foreach($identifiers as $indentifier){
if(Functions::isValidIdentifier($indentifier)){
echo "\t<li><font color=\"green\">Identifier \"".$indentifier."\" is a valid! :-)</font></li>\n";
} else {
$invalid_chars = Functions::getInvalidIdentifierChars($indentifier);
$invalid_offsets = Functions::getInvalidIdentifierOffsets($indentifier);
$add_to_offset=0;
foreach($invalid_offsets as $offset){
$offset += $add_to_offset;
$indentifier = substr($indentifier, 0, $offset)."<font color=\"red\"><u><strong>".$indentifier[$offset]
."</strong></u></font>".substr($indentifier, $offset+1);
$add_to_offset += $new_chars;
}
$invalid_chars = str_replace(" ","WHITSPACE",implode(",",$invalid_chars));
echo "\t<li>Identifier \"".$indentifier."\" is a invalid! :-( Invalid chars: ".$invalid_chars."</li>\n";
}
}
echo "</ul>\n";
echo "</body>\n";
echo "</html>\n";
This produces the following output:
HTML-Code:
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body>
<ul>
<li><font color="green">Identifier "äsdsd" is a valid! :-)</font></li>
<li><font color="green">Identifier "àaàsas" is a valid! :-)</font></li>
<li><font color="green">Identifier "_àaàsas" is a valid! :-)</font></li>
<li>Identifier "<font color="red"><u><strong>'</strong></u></font>àaàsas" is a invalid! :-( Invalid chars: '</li>
<li>Identifier "<font color="red"><u><strong>"</strong></u></font>àaàsas" is a invalid! :-( Invalid chars: "</li>
<li>Identifier "àaà<font color="red"><u><strong>'</strong></u></font>sas" is a invalid! :-( Invalid chars: '</li>
<li>Identifier "àaà<font color="red"><u><strong>"</strong></u></font>sas" is a invalid! :-( Invalid chars: "</li>
<li>Identifier "<font color="red"><u><strong> </strong></u></font>àaàsas" is a invalid! :-( Invalid chars: WHITSPACE</li>
<li>Identifier "àaàsas<font color="red"><u><strong> </strong></u></font>" is a invalid! :-( Invalid chars: WHITSPACE</li>
<li>Identifier "<font color="red"><u><strong>$</strong></u></font>àaàsas" is a invalid! :-( Invalid chars: $</li>
<li>Identifier "<font color="red"><u><strong>\</strong></u></font>àaàsas" is a invalid! :-( Invalid chars: \</li>
<li>Identifier "àaà<font color="red"><u><strong> </strong></u></font>sas" is a invalid! :-( Invalid chars: WHITSPACE</li>
<li>Identifier "<font color="red"><u><strong> </strong></u></font>à<font color="red"><u><strong> </strong></u></font>aà<font color="red"><u><strong> </strong></u></font>sa<font color="red"><u><strong>&</strong></u></font>s" is a invalid! :-( Invalid chars: WHITSPACE,WHITSPACE,WHITSPACE,&</li>
</ul>
</body>
</html>
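Note that the example writes characters such as & and " into the page unescaped. If the identifiers come from user input, it is probably safer to escape them while highlighting. A minimal sketch, assuming a hypothetical helper name `highlightInvalidChars` (not part of the class above); it only marks characters that are invalid anywhere in an identifier, so a leading digit is not marked:

```php
<?php
/**
 * Hypothetical helper: wraps every invalid character of $identifier in
 * <strong> tags and HTML-escapes all text, valid or not.
 */
function highlightInvalidChars(string $identifier): string
{
    // Split into runs of allowed bytes and single invalid characters,
    // keeping the invalid characters (the captured delimiter) in the result.
    $parts = preg_split(
        '#([^a-zA-Z0-9_\x7f-\xff])#',
        $identifier,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    $html = '';
    foreach ($parts as $part) {
        $escaped = htmlspecialchars($part, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
        // A part is either a run of allowed bytes or exactly one invalid char.
        $isInvalid = preg_match('#^[^a-zA-Z0-9_\x7f-\xff]\z#', $part) === 1;
        $html .= $isInvalid ? "<strong>{$escaped}</strong>" : $escaped;
    }
    return $html;
}

echo highlightInvalidChars("sa&s"), "\n"; // sa<strong>&amp;</strong>s
```

Because the string is built piece by piece, no offset bookkeeping is needed after each inserted tag.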