Translate a component of a series
I have a bunch of files that I am moving from one wiki (Markdown based) to another (Creole based). I have written a couple of sed scripts to do things such as convert link formats and header formats. But the new wiki allows a directory structure, and I would rather use that than the pseudo-directory structure I have now. I have already renamed the files, but I need to convert all of the links from _ delimited to / delimited.
Basic information:
Creole link: [[url]] [[url|name]]
I only want to convert links that do not contain . or /.
I would really appreciate it if you explained what the command you give means, so that I can learn from it.
this is a line with a [[Link_to_something]] and [[Something_else|something else]]
this site is cool [[http://example.com/this_page]]
to
this is a line with a [[Link/to/something]] and [[Something/else|something else]]
this site is cool [[http://example.com/this_page]]
I have tried:
y/// only works on the entire line.
s//\u\2 only supports case translations.
I think I'd use Perl. It can be done as a one-liner, thus:
perl -pe 's{\[\[([^/.|]+)((?:\|[^]]+)?)\]\]}{$x=$1;$y=$2;$x=~s%_%/%g;"[[$x$y]]"}gex;' <<'EOF'
this is a line with a [[Link_to_something]] and [[Something_else|something else]]
this site is cool [[http://example.com/this_page]]
EOF
The output from that is:
this is a line with a [[Link/to/something]] and [[Something/else|something else]]
this site is cool [[http://example.com/this_page]]
Whether that's good style etc. is entirely open to debate.
I'll explain this version of the code, which is isomorphic with the code above:
perl -e 'use strict; use warnings;
while (my $line = <>)
{
    # $1 is the link target; $2 is the optional "|name" part (empty if absent).
    $line =~ s{ \[\[ ([^/.|]+) ((?:\|[^]]+)?) \]\] }
              { my($x, $y) = ($1, $2); $x =~ s%_%/%g; "[[$x$y]]" }gex;
    print $line;
} '
The while loop is basically what the -p option provides in the first version. I've explicitly named the input variable as $line instead of using the implicit $_ as in the first version. I also had to declare $x and $y because of the use strict; use warnings; line.
The substitute command takes the form s{pattern}{replace} because there are slashes in the regexes themselves. The x modifier allows (non-significant) spaces in the pattern, which makes it easier to lay out; the replacement can be spaced out too because, with the e modifier, it is ordinary Perl code. The g modifier repeats the substitution as often as the pattern matches. The e modifier says 'treat the right-hand part of the substitution as an expression'.
The matching pattern looks for a pair of open square brackets, then remembers a sequence of characters other than /, . or |, optionally followed by a | and a sequence of characters other than ], finishing at a pair of close square brackets. The two captures are $1 and $2.
The replacement expression saves the values of $1 and $2 in variables $x and $y. It then applies a simpler substitution to $x, changing underscores into slashes. Then the result value is the string of [[$x$y]]. You can't modify $1 or $2 directly in the replacement expression. And the inner s%_%/%g; clobbers $1 and $2, which is why I needed $x and $y.
There might be another way to do it - this is Perl, so TMTOWTDI: there's more than one way to do it. But this does at least work.