PHP脚本在将数据从CSV导入MySQL时保持超时
I'm stumped on this one. I'm a little above beginner-level with PHP/MySQL and very new with posting on this site. GoDaddy switched me over to a grid server to boost performance and shed light on a problem with the way I have coded a script. It grabs data out of a CSV and attempts to insert into a normalized database.
The CSV is not normalized so there is a lot of checking to see if something exists. I originally had it opening/closing result sets, but then it was suggested to me to use prepared statements and unfortunately I have run into the same problem. I can get through about 1200 of 14k records before getting the broad "Internal Server Error". The error in the log references a security feature where it prevents hitting the FastCGI server too much in a short period of time.
What I'm trying to find out and learn is the proper method to accomplish what I'm trying to do -- check to see if something exists; if it does, get the record ID. If not, insert the data and get the new ID. My code is below. It gets the file name and a hidden attribute from a simple php file upload form and starts going from there. This will only be used by me and the data I'm inserting is public record so security isn't a major concern.
<?php
if ($_POST["upload"] == "1") {
//Connect to the database
$hostname = xxx;
$username = xxx;
$dbname = xxx;
$password = xxx;
$dbh = mysqli_connect($hostname,$username,$password,$dbname) or die("Problem connecting: ".mysqli_error());
$stmt = mysqli_stmt_init($dbh);
//check for file errors
if ($_FILES["file"]["error"] > 0)
{ echo "Return Code: " . $_FILES["file"]["error"] . "<br>"; }
//No file errors
else
{
//If file already exists
if (file_exists($_FILES["file"]["name"]))
{
echo $_FILES["file"]["name"] . " already exists.";
exit;
}
//If it doesn't exist
else
{
move_uploaded_file($_FILES["file"]["tmp_name"],
$_FILES["file"]["name"]);
echo "Stored in: " . $_FILES["file"]["name"] . "<br><br>";
$strFileName = $_FILES["file"]["name"];
}
}
//File reporting
echo "Upload: " . $_FILES["file"]["name"] . "<br>";
echo "Type: " . $_FILES["file"]["type"] . "<br>";
echo "Size: " . ($_FILES["file"]["size"] / 1024) . " kB<br>";
echo "Temp file: " . $_FILES["file"]["tmp_name"] . "<br>";
$row = 0;
if (($handle = fopen($strFileName, "r")) !== FALSE) {
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
$num = count($data);
$row++;
$strPermitNo = trim($data[0]);
//Check to see if the permit is already in the database
$sql = "SELECT LocID FROM tbl_TABC_Locations WHERE LocPermitNo = ?";
if (mysqli_stmt_prepare($stmt, $sql))
{
mysqli_stmt_bind_param($stmt, "s", $strPermitNo);
mysqli_stmt_bind_result($stmt, $intLocID);
mysqli_stmt_execute($stmt);
$strPermitResult = 0;
while (mysqli_stmt_fetch($stmt))
{
$strPermitResult = $intLocID;
}
}
//If no permits, insert it
if ($strPermitResult == "0")
{
//Clean Location name
$strLocName = trim($data[1]);
$strLocName = str_replace('"', "", $strLocName);
$strLocName = str_replace(";","-", $strLocName);
$strLocName = addslashes($strLocName);
$strInsertQuery = "INSERT INTO tbl_TABC_Locations (LocName,LocAddress,LocCity,LocState,LocZip,LocCounty,LocPhone,LocPermitNo) VALUES (?, ?, ?, ?, ?, ?, ?, ?)";
if (mysqli_stmt_prepare($stmt, $strInsertQuery))
{
mysqli_stmt_bind_param($stmt, 'ssssiiis', $field1, $field2, $field3, $field4, $field5, $field6, $field7, $field8);
$field1 = $strLocName;
$field2 = trim(addslashes($data[2]));
$field3 = trim(addslashes($data[3]));
$field4 = trim($data[4]);
$field5 = trim($data[5]);
$field6 = trim($data[6]);
$field7 = trim($data[7]);
$field8 = $strPermitNo;
mysqli_stmt_execute($stmt);
$intLocID = mysqli_insert_id($dbh);
}
}
else
{
$intLocID = $strPermitResult;
}
//Report dates
$strReportDate = trim($data[8]);
$aryNewDate = explode("/", $strReportDate);
$strNewYear = $aryNewDate[0];
$strNewMonth = $aryNewDate[1];
//Check to see if the report date is already in there
$sql = "SELECT ReportDateID FROM tbl_TABC_ReportDates WHERE ReportYear = ? AND ReportMonth = ?";
if (mysqli_stmt_prepare($stmt, $sql))
{
mysqli_stmt_bind_param($stmt, "ii", $strNewYear, $strNewMonth);
mysqli_stmt_bind_result($stmt, $intReportDateID);
mysqli_stmt_execute($stmt);
$strReportDateResult = 0;
while (mysqli_stmt_fetch($stmt))
{
$strReportDateResult = $intReportDateID;
}
}
if ($strReportDateResult == "0")
{
$strInsertQuery = "INSERT INTO tbl_TABC_ReportDates (ReportMonth,ReportYear) VALUES (?, ?)";
if (mysqli_stmt_prepare($stmt, $strInsertQuery))
{
mysqli_stmt_bind_param($stmt, "ii", $field1, $field2);
$field1 = $strNewMonth;
$field2 = $strNewYear;
mysqli_stmt_execute($stmt);
$intDateID = mysqli_insert_id($dbh);
}
}
else
{
$intReportDateID = $strReportDateResult;
}
//Check to see if they have reported for the month already, and if not, add the report
$sql = "SELECT ReportID FROM tbl_TABC_Reports WHERE ReportDateID = ? AND LocID = ?";
if (mysqli_stmt_prepare($stmt, $sql))
{
mysqli_stmt_bind_param($stmt, "ii", $intReportDateID, $intLocID);
mysqli_stmt_bind_result($stmt, $intReportID);
mysqli_stmt_execute($stmt);
$strReportIDResult = 0;
while (mysqli_stmt_fetch($stmt))
{
$strReportIDResult = $intReportID;
}
}
if ($strReportIDResult == "0")
{
$strInsertQuery = "INSERT INTO tbl_TABC_Reports (LocID,ReportDateID,TaxReceipts) VALUES (?, ?, ?)";
if (mysqli_stmt_prepare($stmt, $strInsertQuery))
{
mysqli_stmt_bind_param($stmt, "iid", $field1, $field2, $field3);
$field1 = $intLocID;
$field2 = $intReportDateID;
$field3 = trim($data[9]);
mysqli_stmt_execute($stmt);
echo "New report<br>
";
}
}
else { echo "<b>Already reported</b><br>"; }
}
echo "Closing file now";
fclose($handle);
}
mysqli_close($dbh);
}
The error from the log is this:
[2436594] [fcgid:warn] (104)Connection reset by peer: [client xxx] mod_fcgid: error reading data from FastCGI server, referer (my webpage address)
[fcgid:warn] (104)Connection reset by peer: [client xxx] mod_fcgid: ap_pass_brigade failed in handle_request_ipc function, referer (my webpage address)
Edit 12/15 (Pulled prepared statements outside of loop). Now I'm still getting "Number of variables don't match in prepared statement" errors:
$sql1 = "SELECT LocID FROM tbl_TABC_Locations WHERE LocPermitNo = ?";
$ps_ChkPermit = mysqli_stmt_prepare($stmt, $sql1);
if ($ps_ChkPermit)
{
mysqli_stmt_bind_param($stmt, "s", $strPermitNo);
mysqli_stmt_bind_result($stmt, $intLocID);
mysqli_stmt_execute($stmt);
...
}
In general, you want to remove as much computation from within a loop to outside of the loop. Because every line of code gets executed over and over again during each iteration of the loop.
@MikeW's suggesting that you try to slim down your script's resource requirements as much as possible. Consider the following example (untested code!):
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
$sql = "SELECT LocID FROM tbl_TABC_Locations WHERE LocPermitNo = ?";
if (mysqli_stmt_prepare($stmt, $sql)) // <-- this keeps running every time.
{
mysqli_stmt_bind_param($stmt, "s", $strPermitNo);
mysqli_stmt_bind_result($stmt, $intLocID);
mysqli_stmt_execute($stmt);
...
}
}
Why prepare your SQL statement over and over if its going to be the same every time?
$sql = "SELECT LocID FROM tbl_TABC_Locations WHERE LocPermitNo = ?";
// this statement gets prepped outside the loop and runs only once.
$prepared_statement = mysqli_stmt_prepare($stmt, $sql);
// loop starts...
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
if ($prepared_statement)
{
// and then you simply bind the params and execute from within the loop.
mysqli_stmt_bind_param($stmt, "s", $strPermitNo);
mysqli_stmt_bind_result($stmt, $intLocID);
mysqli_stmt_execute($stmt);
...
}
}
This way, you save on resources especially when you have to process many lines in your CSV.
Of course, it will mean that you need different variable names for different queries so you can identify each one. Do this right around where you currently declare $stmt
The second part of his suggestion involves a little more work. To reduce the number of queries, you can create a new field in the database which includes both the YEAR and MONTH and set this as a UNIQUE
index in mysql. That way if you try to insert an existing record, mysql will throw and error.
If there is an error while inserting, you can assume you have a report for that date. If you don't have an error then the report is new.
Then you don't have the extra step of preparing yet another query just to check for a report's existence!
As I pointed out above, you can also reduce the CSV file sizes so it doesn't take too long to complete.
Further to @halfer's suggestion, it might be simpler to run this script with PHP-CLI. No memory limit timeouts - but this will mean you need to save the uploaded files somewhere and use cron tasks to process them later...
Requires some familiarity with the command line :)
Hope this clears stuff up a little... Good luck!