Parsing a line of text into different variables using PHP
I am very new to PHP, so I apologize for the seemingly simple question. I need to parse a line of text into different variables. More specifically, I need to parse many lines of text into different arrays. Each line of text resembles the following:
timeStamp UserName* garbage text Number x item*
timeStamp UserName* garbage text Number x item*
timeStamp UserName* garbage text Number x item*
Both userName and item can contain spaces. I assume the best way to go about this would be four different arrays?
The actual data looks like the following:
03:12:34 mhopkins321 has acquired 5 x bottles of water
09:38:01 Nick Smith has acquired 100 x pennies
23:22:59 Fancy Frank has acquired 15684 x artichoke hearts
So I would assume the arrays would be
$timeStamp  $userName    $amount  $items
03:12:34    mhopkins321  5        bottles of water
09:38:01    Nick Smith   100      pennies
23:22:59    Fancy Frank  15684    artichoke hearts
This is a very bad format for machine parsing. Especially problematic is that names may have spaces but are not delimited.
The only foolproof way to parse this is to know all the "garbage text" strings that may appear between the name and the amount. Unless you have a complete list, you may mess up your user names.
It's possible to parse this by using explode() to split a line into an array and then extracting the parts. However, I think you should just use a regular expression.
$sample = "
03:12:34 mhopkins321 has acquired 5 x bottles of water
09:38:01 Nick Smith has acquired 100 x pennies
23:22:59 Fancy Frank has acquired 15684 x artichoke hearts
";
$re = '/^(?<timeStamp>[0-9]{2}:[0-9]{2}:[0-9]{2}) # timestamp
\s+
(?<userName>[\w\s]+) # user name
\s+(?:has\s+acquired)\s+ # garbage text between name and amount
(?<amount>\d+) # amount
\s+x\s+ # multiplication symbol
(?<items>.*)\s*$ # item name (to end of line)
/xmu';
preg_match_all($re, $sample, $matches, PREG_SET_ORDER);
var_export($matches);
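If the goal is the four separate arrays from the question, one option is to leave out PREG_SET_ORDER. With the default PREG_PATTERN_ORDER, preg_match_all() groups the results by capture group, so each named group already comes back as one array per column. The following is only a sketch that reuses the pattern and sample data from above; casting the amounts to integers is an assumption about how they will be used.

```php
<?php
$sample = "
03:12:34 mhopkins321 has acquired 5 x bottles of water
09:38:01 Nick Smith has acquired 100 x pennies
23:22:59 Fancy Frank has acquired 15684 x artichoke hearts
";

// Same pattern as above (x = allow comments, m = ^ and $ match per line).
$re = '/^(?<timeStamp>[0-9]{2}:[0-9]{2}:[0-9]{2}) # timestamp
\s+
(?<userName>[\w\s]+)      # user name
\s+(?:has\s+acquired)\s+  # garbage text between name and amount
(?<amount>\d+)            # amount
\s+x\s+                   # multiplication symbol
(?<items>.*)\s*$          # item name (to end of line)
/xmu';

// Default flag is PREG_PATTERN_ORDER: $m['userName'] holds every user name, etc.
preg_match_all($re, $sample, $m);

$timeStamp = $m['timeStamp'];
$userName  = $m['userName'];
$amount    = array_map('intval', $m['amount']); // assumption: amounts are wanted as integers
$items     = $m['items'];

// The arrays are parallel: index $i refers to the same input line in each.
foreach ($timeStamp as $i => $ts) {
    printf("%s | %s | %d | %s\n", $ts, $userName[$i], $amount[$i], $items[$i]);
}
```

Lines that do not match the pattern are silently skipped, so the four arrays always stay the same length.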
It looks like you are going to need a regular expression to split each line of text. Regular expressions are not easy to understand at first, but they are the tool you need for cases like the one you describe. Manual page: http://br2.php.net/manual/en/book.pcre.php
You need to find patterns in the text. For example, does the timestamp always start at the very beginning of the line, and is it always 8 characters long?
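To illustrate the idea of relying on fixed patterns, here is a rough sketch that uses no regular expression at all. It assumes that the timestamp is always the first 8 characters, that the filler text is always exactly " has acquired ", and that " x " separates the amount from the item. The function name parseLine is made up for this example.

```php
<?php
// Hypothetical helper: parses one line using fixed positions and known separators.
// Returns null when the line does not have the expected shape.
function parseLine(string $line): ?array
{
    $line = trim($line);
    if (strlen($line) < 9) {
        return null; // too short to hold a timestamp plus anything else
    }

    // Assumption: the timestamp is always the first 8 characters (HH:MM:SS).
    $timeStamp = substr($line, 0, 8);
    $rest = ltrim(substr($line, 8));

    // Split the user name from the remainder at the known filler text.
    $parts = explode(' has acquired ', $rest, 2);
    if (count($parts) !== 2) {
        return null;
    }
    [$userName, $tail] = $parts;

    // Split "5 x bottles of water" into the amount and the item name.
    $parts = explode(' x ', $tail, 2);
    if (count($parts) !== 2 || !ctype_digit($parts[0])) {
        return null;
    }

    return [
        'timeStamp' => $timeStamp,
        'userName'  => $userName,
        'amount'    => (int) $parts[0],
        'items'     => $parts[1],
    ];
}

var_export(parseLine('03:12:34 mhopkins321 has acquired 5 x bottles of water'));
```

This only works while the filler text never changes; a user name that itself contains " has acquired " would be split in the wrong place.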