Extracting Urban Multilane Road Information from the OSM Road Network Using Random Forests

Multilane roads extracted from the OpenStreetMap urban road network using random forests. DOI: 10.1111/tgis.12514.


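OSM-assisted extraction of 3D vector road boundaries from vehicle-borne laser point clouds: https://www.hanspub.org/journal/PaperInformation.aspx?paperID=24702

A first look at osmnx: https://blog.csdn.net/qq_32002189/article/details/88687028

A polygon extraction method for the arterial roads of urban road networks: http://www.docin.com/p-775607614.html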

High-precision 3D road information plays an important role in intelligent transportation, urban planning, and management. A mobile laser scanning system can quickly capture 3D information about a street scene, but it is difficult to extract complete and accurate road boundaries directly from the raw point cloud because of the large data volume, occlusion, and the complexity of urban street scenes. OpenStreetMap (OSM) is crowdsourced geographic data that can be used to assist road extraction from mobile laser point clouds. This paper proposes a 3D road boundary extraction algorithm that integrates two-dimensional OSM vector data with vehicle-borne laser point cloud data. First, a point cloud feature map is constructed by analyzing the spatial distribution characteristics of the scanning points. OSM provides the initial position, and the road boundary is then extracted from the point cloud feature map using an improved active contour model. Experiments were carried out with StreetMapper data. The results show that the proposed algorithm can repair missing boundary information caused by point cloud defects and can extract 3D road boundary information accurately and completely, which demonstrates strong robustness and applicability.

Extracting Urban Multilane Road Information from the OSM Road Network Using Random Forests

The volunteered geographic information (VGI) collected in OpenStreetMap (OSM) has been used in many applications. Extracting multilane roads and establishing a high level of expressed detail play important roles in the field of automated cartographic generalization. An accurate and detailed extraction process benefits geographic analysis, urban region division, and road network construction, as well as transportation application services. The road networks in OSM have a high level of detail and complex structures; however, they also include many duplicate lines, which degrade the efficiency and increase the difficulty of extracting multilane roads. To resolve these problems, this work proposes a machine-learning-based approach in which the road networks are first converted from lines to polygons. Then, various geometric descriptors, including compactness, width, circularity, area, perimeter, complexity, parallelism, shape descriptor, and width-to-length ratio, are used to train a random forest (RF) classifier and identify the candidates. Finally, another RF is trained to evaluate the candidates using all the geometric descriptors and topological features; the outputs of this second trained RF are the predicted multilane roads. An experiment using OSM data from Beijing, China validated the proposed method, which performs highly effectively when extracting multilane roads from OSM.
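The two-stage idea in the abstract can be sketched in a few lines. This is only a toy illustration under stated assumptions, not the authors' implementation: `TinyForest` is a hypothetical stand-in (bagged depth-1 trees with random feature subsets) for a real random forest library, the feature values are invented, and the separate `is_candidate` labelling for the first stage is an assumption made so that the example stays deterministic.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def fit_stump(X, y, feats):
    """Best single-threshold split (a depth-1 tree) over a feature subset."""
    best = None
    for f in feats:
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # midpoint between neighbouring values
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t, round(sum(left) / len(left)),
                        round(sum(right) / len(right)))
    if best is None:  # all sampled values identical: predict the majority class
        m = round(sum(y) / len(y))
        return feats[0], 0.0, m, m
    return best[1:]

class TinyForest:
    """Toy random forest: bootstrap samples plus random feature subsets."""
    def __init__(self, n_trees=25, seed=0):
        self.n_trees, self.rng, self.stumps = n_trees, random.Random(seed), []

    def fit(self, X, y):
        if len(set(y)) < 2:
            raise ValueError("need both classes to train")
        n, d = len(X), len(X[0])
        k = max(1, int(d ** 0.5))  # features tried per tree
        self.stumps = []
        while len(self.stumps) < self.n_trees:
            idx = [self.rng.randrange(n) for _ in range(n)]  # bootstrap sample
            ys = [y[i] for i in idx]
            if len(set(ys)) < 2:
                continue  # toy guard: redraw single-class samples
            feats = self.rng.sample(range(d), k)
            self.stumps.append(fit_stump([X[i] for i in idx], ys, feats))
        return self

    def predict(self, X):
        out = []
        for row in X:
            votes = sum(l if row[f] <= t else r for f, t, l, r in self.stumps)
            out.append(int(2 * votes > len(self.stumps)))  # majority vote
        return out

# Invented polygons: [width_to_length, compactness, topological_intensity]
multilane = [[0.02, 0.05, 6], [0.03, 0.08, 5], [0.04, 0.10, 4], [0.05, 0.12, 5]]
narrow_block = [[0.12, 0.25, 0], [0.13, 0.27, 1], [0.14, 0.28, 0], [0.15, 0.30, 1]]
city_block = [[0.6, 0.6, 1], [0.8, 0.7, 2], [0.9, 0.75, 0], [1.0, 0.8, 1]]
X = multilane + narrow_block + city_block
is_candidate = [1] * 8 + [0] * 4  # stage-1 target (assumed labelling)
is_multilane = [1] * 4 + [0] * 8  # final target

GEOM = slice(0, 2)  # stage 1 sees geometric descriptors only
stage1 = TinyForest(seed=1).fit([r[GEOM] for r in X], is_candidate)
keep = [i for i, c in enumerate(stage1.predict([r[GEOM] for r in X])) if c]
# Stage 2 is trained on the candidates, with geometric and topological features.
stage2 = TinyForest(seed=2).fit([X[i] for i in keep], [is_multilane[i] for i in keep])

def classify(polygons):
    """1 only if stage 1 keeps the polygon and stage 2 confirms it."""
    cand = stage1.predict([p[GEOM] for p in polygons])
    final = stage2.predict(polygons)
    return [int(c and f) for c, f in zip(cand, final)]

print(classify([[0.03, 0.07, 5], [0.14, 0.29, 0], [0.9, 0.7, 1]]))  # [1, 0, 0]
```

In practice a library random forest with full decision trees would replace `TinyForest`; the point here is only the cascade, where the second model sees fewer polygons but more features.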

1 | INTRODUCTION

As information technology has improved, cartography has largely switched from digitization to informatization, and has begun to focus on automatic mapping requirements, including multiscale expressions of spatial data in geographic information science (GIS), series scale-map production, updating multiscale geospatial databases, and so on. This process is termed "smart cartography" and has been widely researched (Wang, 2010). One hot research topic is the ability to automatically derive small-scale road networks from large-scale road networks, which form the most important feature on many maps. Multiscale road network cartography lies at the core of, and is a key aspect of, many analysis and application studies. As multilane roads play an important role in city road network transportation patterns from the fine-grained to the coarse-grained level, their functional hierarchy is crucial (Heinzle & Anders, 2007; Heinzle, Anders, & Sester, 2006; Zhang, 2004).

In recent years, volunteered geographic information (VGI) such as the OpenStreetMap (OSM) project has been widely used for updating spatial databases, in spatial analysis, and in many other applications (Xu, Chen, Xie, & Wu, 2017) because every user can become a contributor (Goodchild, 2007; Li & Qian, 2010). The development of global positioning system (GPS) devices, which can acquire personal geographical location information (Zou, Yu, & Cao, 2017), has made highly detailed OSM road network data easy to obtain. Multiscale expressions of road networks and the production of multiscale maps have engendered many new research opportunities. Such studies are helpful in studying the automatic synthesis of road networks and in improving the production of map data. The OSM wiki defines a "lanes" tag to specify how many traffic lanes a highway has. However, most road layers lack the tag, and OSM road network data have almost no clear indication of multilane road properties; thus, the data are of limited use for research on the functional levels of roads. The goal of this study was to extract multilane roads from OSM urban road networks. This study was undertaken for the following reasons:
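To make the missing-tag problem concrete, the sketch below measures how many ways carry a usable `lanes` value. The sample tag dictionaries are invented for illustration and the helper name `lanes_of` is hypothetical; real data would come from an OSM extract.

```python
def lanes_of(tags):
    """Parse the OSM `lanes` tag; return None when missing or not a plain integer."""
    value = tags.get("lanes", "").strip()
    return int(value) if value.isdecimal() else None

# Made-up tag dictionaries standing in for OSM ways.
ways = [
    {"highway": "primary", "lanes": "4"},
    {"highway": "residential"},               # tag missing
    {"highway": "secondary", "lanes": "2;3"}, # not a single integer
]

tagged = [w for w in ways if lanes_of(w) is not None]
coverage = len(tagged) / len(ways)
print(f"usable lanes tag on {coverage:.0%} of ways")
```

When coverage is low, as the text reports for most road layers, the lane count cannot simply be read from attributes, which motivates the geometric approach.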

1. The multilane roads in an urban road network form a framework for the construction of urban road networks. Generally, the multilane roads in urban road networks have high traffic capacity and represent the urban traffic flow model. Thus, analyzing the traffic flow of multilane roads is very important when constructing urban road networks.

2. High level of detail (LoD) data concerning urban road networks are required when building road network databases. The data quality of multilane roads is directly related to the quality of road network data at different scales, which affects the quality of multiscale map expression. Therefore, it is important to study the most appropriate way to extract the multilane roads from a road network to establish an application database.

3. The multilane roads of an urban road network play important roles in geographical analysis, traffic analysis, traffic application services, and so on. The multilane road network also plays an important role in building urban road network models, as well as at the function level.

This study extracted multilane roads from the OSM road network using a random forest (RF)-based method. Most of the multilane roads in a city are expressed as multiple lanes, which can be considered as several closed polygons constructed from their intersection points. Therefore, multilane roads can be extracted using polygon analysis techniques (Li, Fan, Luan, Yang, & Liu, 2014), and this study proposes a polygon-based intelligent extraction method for the multilane roads of urban road networks. The proposed method uses more effective shape descriptors for circularity, complexity, and compactness to describe the multilane polygons. By combining these shape descriptors with the topological characteristics of polygons between roads, some candidate polygons are evaluated by another trained RF. This method is both highly feasible and introduces no loss of precision, making it a significant step in improving and optimizing road networks.
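As a rough illustration of why shape descriptors help, the sketch below computes a few of them for a long thin strip and for a square. The formulas are common textbook choices (shoelace area, the isoperimetric quotient 4πA/P² for compactness, and 2A/P as an approximate width of an elongated polygon); they are assumptions here, since this excerpt does not give the paper's exact definitions.

```python
import math

def area_perimeter(ring):
    """Shoelace area and perimeter of a closed ring given as [(x, y), ...]."""
    area2 = perim = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        area2 += x1 * y2 - x2 * y1
        perim += math.hypot(x2 - x1, y2 - y1)
    return abs(area2) / 2.0, perim

def descriptors(ring):
    """A few geometric descriptors; definitions are assumed, not the paper's."""
    area, perim = area_perimeter(ring)
    return {
        "area": area,
        "perimeter": perim,
        "compactness": 4.0 * math.pi * area / perim ** 2,  # 1.0 for a circle
        "approx_width": 2.0 * area / perim,  # close to the width of a long strip
    }

strip = [(0, 0), (100, 0), (100, 4), (0, 4)]  # shaped like the gap between carriageways
block = [(0, 0), (1, 0), (1, 1), (0, 1)]      # shaped like an ordinary city block

print(descriptors(strip)["compactness"])  # about 0.116
print(descriptors(block)["compactness"])  # about 0.785
```

The strip scores far lower on compactness than the square, which is the kind of contrast a classifier can pick up.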

The remainder of this article is organized as follows. Section 2 provides an overview of prior work related to this study. Section 3 describes in detail the method for extracting multilane roads using the RF. Section 4 describes and discusses the experimental results, and Section 5 presents concluding remarks.

2 | RELATED WORK

Road network synthesis is an important research field in cartography, and considerable research has been conducted on matching, recognizing, and extracting roads (Kuntzsch, Sester, & Brenner, 2016; Volker & Fritsch, 1999; Xiong, 2000). Regarding extraction methods for multilane roads, numerous approaches exist, including manual, semi-automated, and fully automated. In the early stage, road-level attributes were used as the extraction metric (Wang, 1994); however, this method is limited by factors such as data quality and data providers' expertise. Some scholars have proposed the concept of a "stroke," which is defined as a road that is connected, unbranched, and coherent; subsequently, multilane roads can be selected according to the stroke order (Thomson, 2006; Thomson & Richardson, 1999; Yang, Luan, & Li, 2011). The stroke value can be calculated from multiple attributes such as road length (Chaudhry & Mackaness, 2005), connectivity between strokes (Zhang, 2005), and so on (Jiang & Claramunt, 2004). Indeed, the stroke concept is an effective structural model that allows road network analysis based on the importance of every road path, even without other information (Mackaness, Ruas, & Sarjakoski, 2011). However, the stroke concept does not consider spatial topology; therefore, it can be accurate only at the local level. In recent years, several methods have been proposed for extracting road networks based on their geometric features, topological relations, and spatial distribution characteristics (Guo, Qian, Huang, He, & Liu, 2014; He, Qian, Liu, Wang, & Hu, 2015). Among these, some have introduced intelligent algorithms, including a case study approach (Guo et al., 2014), a method that used the genetic algorithm (Wang & Deng, 2005), and another that used a neural network (Balboa & López, 2008; Zhou & Li, 2014). The case study methodology simplifies the complex extraction process but depends highly on an expert case library. The genetic algorithm is time-consuming, and the genetic model can experience convergence problems. Although the intelligent methods used to analyze road networks each have their own advantages and disadvantages, further research and scientific and technological advances should continue to improve them.

Some studies of multilane road extraction are based on lines: parallel lines in proximity are defined as multilane roads when they exhibit appropriate angles, lengths, and distances, and they are connected by growing a buffer to generate the road network (Yang et al., 2011; Zhang, 2009). However, because some VGI data are of poor quality, such as the road network data in OSM, it is both time-consuming and error-prone to extract multilane roads using only lines (Li et al., 2014). Fortunately, a new approach based on polygon analysis has been proposed (Li et al., 2014), which converts road lines to polygons to better describe the road network. That study used a support vector machine (SVM) to classify multilane roads. Polygon analysis is a better approach for solving the poor-data problem of VGI data, but it requires capable polygon shape descriptors and an effective method to determine the polygons that represent multilane roads (Li et al., 2014).
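The line-based test described above (angle, length, and distance between nearby lines) might look roughly like the following. The function names and all three thresholds are hypothetical placeholders for illustration; they are not taken from the cited studies.

```python
import math

def direction(seg):
    """Heading of a segment ((x1, y1), (x2, y2)) in radians."""
    (x1, y1), (x2, y2) = seg
    return math.atan2(y2 - y1, x2 - x1)

def angle_between(seg_a, seg_b):
    """Smallest angle between two undirected segments, in degrees (0 to 90)."""
    d = abs(direction(seg_a) - direction(seg_b)) % math.pi
    return math.degrees(min(d, math.pi - d))

def distance_to_line(point, seg):
    """Perpendicular distance from a point to the infinite line through seg."""
    (x1, y1), (x2, y2) = seg
    px, py = point
    cross = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
    return abs(cross) / math.hypot(x2 - x1, y2 - y1)

def is_parallel_pair(seg_a, seg_b, max_angle=10.0, max_dist=40.0, min_len_ratio=0.5):
    """True when two segments are nearly parallel, close, and of similar length."""
    len_a, len_b = math.dist(*seg_a), math.dist(*seg_b)
    mid_b = ((seg_b[0][0] + seg_b[1][0]) / 2, (seg_b[0][1] + seg_b[1][1]) / 2)
    return (angle_between(seg_a, seg_b) <= max_angle
            and distance_to_line(mid_b, seg_a) <= max_dist
            and min(len_a, len_b) / max(len_a, len_b) >= min_len_ratio)

eastbound = ((0, 0), (100, 0))
westbound = ((100, 15), (0, 15))    # drawn in the opposite direction
cross_street = ((0, 0), (0, 100))

print(is_parallel_pair(eastbound, westbound))     # True
print(is_parallel_pair(eastbound, cross_street))  # False
```

A pairwise test like this has to be tuned per dataset and run over many segment pairs, which gives a sense of why the text calls the line-only approach time-consuming and error-prone on noisy data.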

In contrast to the abovementioned studies, and by taking full advantage of the polygon analysis method, we aim to extract the multilane roads from OSM data using a machine learning RF-based approach. In this study, polygon circularity, parallelism, and width are defined, and shape descriptors are extracted using discrete Fourier transforms. Combined with some other geometric features such as compactness, circularity, perimeter, and complexity, these data form the input to one RF that extracts first-stage candidates. Then, a second RF evaluates the first-stage multilane road candidate polygons to generate the final set of multilane roads. This second model is trained on the input dataset augmented with the proposed topological relationships, including topological intensity and topological connections based on the candidates.
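One common way to build shape descriptors from a discrete Fourier transform is sketched below: treat boundary points as complex numbers, drop the zero-frequency coefficient, take magnitudes, and normalize by the largest one. Whether the paper uses exactly this normalization is not stated in this excerpt, so it should be read as an assumed variant; the function name and `n_coeffs` parameter are illustrative.

```python
import cmath
import math

def fourier_descriptors(points, n_coeffs=4):
    """Normalized DFT magnitudes of a closed contour given as [(x, y), ...]."""
    z = [complex(x, y) for x, y in points]
    n = len(z)
    coeffs = [
        sum(zj * cmath.exp(-2j * math.pi * k * j / n) for j, zj in enumerate(z)) / n
        for k in range(1, n)         # skip c_0, which only encodes position
    ]
    mags = [abs(c) for c in coeffs]  # magnitudes ignore rotation and start point
    scale = max(mags)                # dividing by the largest one removes size
    return [m / scale for m in mags[:n_coeffs]]

# Evenly spaced boundary samples of a 4 x 1 rectangle and a 2 x 2 square.
rect = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (4, 1), (3, 1), (2, 1), (1, 1), (0, 1)]
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]

# The same rectangle scaled by 3, rotated by 0.7 rad, and shifted.
moved = [3 * cmath.exp(0.7j) * complex(x, y) + complex(10, -5) for x, y in rect]
moved = [(p.real, p.imag) for p in moved]

print(fourier_descriptors(rect))
print(fourier_descriptors(moved))   # same values up to rounding
print(fourier_descriptors(square))  # a different signature
```

Because the descriptor ignores position, size, and orientation, the same road polygon yields the same feature vector wherever it lies on the map, which is what makes it usable as classifier input.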

3 | METHODOLOGY

A road network is composed of lines. The complex topological relations between the road segments allow the entire road network to be regarded as a group of polygons. The multilane roads in these road networks always contain some parallel lines; therefore, the polygons that describe multilane roads can be recognized by these features (Figure 1). The approach used in this article attempts to find geometric and topological descriptors for the polygons; then, the RFs are applied to perform a binary classification of the polygons into multilane roads and non-multilane roads.
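The step from lines to polygons can be illustrated with a small face-tracing routine on a planar graph. This is a sketch under strong assumptions: the segments are already split at every intersection, contain no duplicates or dangling ends, and share exact endpoint coordinates. A GIS library would normally do this job; `trace_faces` is a hypothetical name.

```python
import math
from collections import defaultdict

def signed_area(ring):
    """Positive for counter-clockwise rings, negative for clockwise ones."""
    return sum(x1 * y2 - x2 * y1
               for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1])) / 2.0

def trace_faces(edges):
    """Bounded faces (vertex lists) of a planar graph given as pairs of (x, y) nodes."""
    nbrs = defaultdict(list)
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    for v, lst in nbrs.items():
        # Sort neighbours counter-clockwise around v.
        lst.sort(key=lambda w: math.atan2(w[1] - v[1], w[0] - v[0]))

    seen, faces = set(), []
    half_edges = list(edges) + [(b, a) for a, b in edges]
    for u, v in half_edges:
        ring = []
        while (u, v) not in seen:
            seen.add((u, v))
            ring.append(u)
            i = nbrs[v].index(u)
            u, v = v, nbrs[v][i - 1]  # turn to the next edge clockwise around v
        if ring and signed_area(ring) > 0:  # the clockwise ring is the outer face
            faces.append(ring)
    return faces

# Two unit blocks sharing one street segment.
A, B, E = (0, 0), (1, 0), (2, 0)
D, C, F = (0, 1), (1, 1), (2, 1)
streets = [(A, B), (B, E), (E, F), (F, C), (C, D), (D, A), (B, C)]

for face in trace_faces(streets):
    print(face, signed_area(face))
```

Each returned ring is one polygon of the network; descriptors are then computed per polygon and passed to the classifier.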

3.1 | Data preprocessing

OSM is a free worldwide vector map dataset created by volunteers from all over the world. Because some volunteers lack professional training, the OSM dataset has several problems in terms of both data quality and data availability. First, some road data are created repeatedly by different volunteers; thus, repeated lines may exist in OSM data that have not been professionally checked. Second, some of the contributions by non-professional volunteers may be incorrect (Goodchild & Li, 2012). For example, there are unreasonable angles between lines, disconnected lines, and even entangled lines, none of which can occur in a real road network. There is no multilane road attribute in the OSM road network data; therefore, we cannot simply extract the multilane roads in the urban road network based on pre-existing attributes. Instead, we must analyze the characteristics of the road network data carefully and perform high-quality processing of the original OSM road network data to obtain processed data that meet the requirements of the method studied in this article.
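One small part of such preprocessing, removing repeated lines, can be sketched as follows. This is a hypothetical cleaning step that only catches exact or near-exact duplicates (after rounding coordinates); the excerpt does not describe the authors' actual procedure, and fixing bad angles or entangled lines would need more than this.

```python
def dedupe_segments(segments, ndigits=6):
    """Drop zero-length and repeated segments; drawing direction is ignored."""
    seen, kept = set(), []
    for a, b in segments:
        a = (round(a[0], ndigits), round(a[1], ndigits))
        b = (round(b[0], ndigits), round(b[1], ndigits))
        if a == b:
            continue  # degenerate segment
        key = (a, b) if a <= b else (b, a)  # same key for both directions
        if key not in seen:
            seen.add(key)
            kept.append((a, b))
    return kept

raw = [
    ((0, 0), (1, 0)),
    ((1, 0), (0, 0)),          # same street drawn in reverse
    ((0, 0), (1, 0.0000001)),  # near-duplicate within rounding tolerance
    ((2, 2), (2, 2)),          # zero-length
    ((1, 0), (1, 1)),
]
print(dedupe_segments(raw))  # two segments remain
```

The rounding precision `ndigits` is an arbitrary choice here and would depend on the coordinate system of the data.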
