Deformable linear object detection

  • Papers focusing on image processing and DLO feature extraction

2025

HANDLOOM 3.0: Interactive Bi-Directional Cable Tracing Amid Clutter

Justin Yu et al., AUTOLab, ICRA 2025 workshop

GNN Topology Representation Learning for Deformable Multi-Linear Objects Dual-Arm Robotic Manipulation

Alessio Caporali et al., Lar-unibo, IEEE TASE 2025

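  • Task: untangling cluttered DLOs
  • Uses a GNN model to extract a topological representation of tree-structured DLOs from a binary mask
  • Limitations:
    • Because the algorithm relies only on the input mask, it may produce inaccuracies or suboptimal topologies:
      • two distinct branch sections are merged into a single one
      • a branch-point is not detected
  • Pipeline:
    • Initialization
      • A semantic segmentation method first extracts the DLO mask (not the focus of the paper: "Starting from a binary mask"). In the experiments, the mask is obtained by using depth point-cloud information to separate the DLO from the supporting plane ("in this work, a calibrated 3D vision sensor is used to capture a point cloud of the scene")
      • Vertices sampling: sample the binary mask, discretizing it into scattered points that become the graph nodes
      • Edge sampling
    • Topology learning
      • Trained on a synthetic dataset
      • Node and edge encoding:
      • GNN architecture: three GNN layers
    • Motion primitive? Generates a geometric path?
      • The pick-and-place trajectories (line 2) are generated by simulating a circular motion of sabove around the fixed center (held by the other arm).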
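
The vertex and edge sampling steps above can be pictured with a small sketch. It is only an illustration under assumptions: the grid-cell sampling rule, the distance threshold for candidate edges, and the names `sample_vertices` and `sample_edges` are hypothetical and are not taken from the paper.

```python
import math

def sample_vertices(mask, stride):
    """Keep one node per stride x stride cell that contains foreground."""
    nodes = []
    for r0 in range(0, len(mask), stride):
        for c0 in range(0, len(mask[0]), stride):
            cell = [(r, c)
                    for r in range(r0, min(r0 + stride, len(mask)))
                    for c in range(c0, min(c0 + stride, len(mask[0])))
                    if mask[r][c]]
            if cell:
                # Use the centroid of the foreground pixels as the node.
                nodes.append((sum(p[0] for p in cell) / len(cell),
                              sum(p[1] for p in cell) / len(cell)))
    return nodes

def sample_edges(nodes, radius):
    """Propose candidate edges between nodes closer than `radius`."""
    return [(i, j)
            for i in range(len(nodes))
            for j in range(i + 1, len(nodes))
            if math.dist(nodes[i], nodes[j]) <= radius]
```

In a pipeline of this kind, the candidate edges would then be scored by the learned model; the sketch stops before that step.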

Deformable Linear Objects Segmentation and Estimation for Dual-Arm Robot Cable Manipulation

Bin Cao et al., HIT, IEEE Sensors Journal 2025

  • Segmentation uses MobileSAM plus post-processing
  • Problem addressed:
  • Method: segmentation (MobileSAM) -> 3D curve fitting (minimal bending energy) -> non-rigid registration (Coherent Point Drift)
  • Geometry-based: "By minimizing the bending energy of the curve, we optimize the positions of the B-spline control points"
  • expectation–maximization (EM)
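
As a rough illustration of what "bending energy" measures, the sketch below uses a common discretization: the sum of squared second differences of a uniformly sampled curve. This is an assumption made for illustration; the paper optimizes B-spline control points and its exact energy definition may differ. The name `bending_energy` is hypothetical.

```python
def bending_energy(points):
    """Sum of squared second differences of a uniformly sampled polyline."""
    energy = 0.0
    for prev, cur, nxt in zip(points, points[1:], points[2:]):
        # The second difference approximates the local curvature vector.
        second_diff = [a - 2 * b + c for a, b, c in zip(prev, cur, nxt)]
        energy += sum(d * d for d in second_diff)
    return energy
```

A straight, evenly sampled curve scores zero, and any bend raises the value, which is why minimizing it favors smooth cable shapes.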

2024

Monocular Estimation of Connector Orientation Combining Deformable Linear Object Priors and Smooth Angle Classification

Caporali et al., University of Bologna, AIM 2024

  • Main contribution: connector pose estimation
  • Backbone: ResNet
  • Open issues
    • DLO prior versus object detection: replacing the DLO-based prior with object detection
    • Generalization to other objects
    • Angular discretization step: how finely should the connector's rotation angle be discretized?

A Robust Deformable Linear Object Perception Pipeline in 3D: From Segmentation to Reconstruction

Sun Zhaole, et al., University of Edinburgh, RAL 2024

  • Code: TheGoblinTechies/DLO-perception-pipeline (repository for the IEEE RA-L paper)
  • Can handle occlusion (because it assumes that only one DLO is present)
  • Method:
    • First, SAM extracts the mask from a text prompt ("Though SAM on DLO segmentation with text input performs well in most cases"), assuming the image contains a single DLO ("assuming only one DLO to segment in each image. Detecting multiple DLOs is not our focus."). Post-processing:
      1. Remove regions with too few pixels: "Remove connected components whose area is below a threshold (e.g. 0.005% pixels)"
      2. Morphological operations: remove parts whose morphological parameters are out of range. "The allowable number of endpoints is between 0 and 4, and the number of conjunctions is between 0 and 5"
      3. Width-based removal: estimate the DLO width
    • Uses Grounding-DINO with the input text prompt together with the mask produced by SAM
    • The point cloud is then used for 3D DLO reconstruction (keypoints are extracted on the basis of Keipour's method, an initial reconstruction is made with Bezier curves, and the result is downsampled with a B-spline), followed by DLO smoothing (Discrete Elastic Rod fitting)
  • Limitations:
    • Real-time performance: about 10 s per frame; real-time operation was not a design goal
    • Assumes a single DLO in the image; multiple DLOs are not considered
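
The first post-processing step (dropping connected components below an area threshold) could look roughly like the sketch below. It is not the repository's code: the 8-connectivity, the flood-fill implementation, and the name `remove_small_components` are assumptions. The quoted threshold of 0.005% of the pixels corresponds to `min_fraction=0.00005`.

```python
from collections import deque

def remove_small_components(mask, min_fraction):
    """Zero out 8-connected foreground components below an area fraction."""
    rows, cols = len(mask), len(mask[0])
    min_area = min_fraction * rows * cols
    out = [row[:] for row in mask]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood-fill one component and collect its pixels.
            component, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(component) < min_area:
                for y, x in component:
                    out[y][x] = 0
    return out
```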

Topology Prediction of Branched Deformable Linear Objects Using Deep Learning

Shenzhe Ouyang et al., University of Stuttgart, IEEE Access 2024

  • Predicts cable branches with a model (deep learning + coordinate-regression-based 2D spline prediction) and compares it against YOLOv8-Pose
    1. Model architecture: Swin-Transformer + Panoptic Feature Pyramid Networks generate heatmaps; the model is called the Swin-transformer Panoptic Spline Predictor (SPS model)
    2. For direct topology prediction, the model can learn specific features of a wire harness with a fixed topology
    3. The indirect topology prediction is more versatile and allows the model to make predictions based on common features of the individual segments from the target wire harness.
  • Direct and indirect topology prediction
  • BDLOs: branched deformable linear objects
  • Data collection details:
    • Four datasets were collected
      • D1: 303 images of a three-segment wire harness, with 8 keypoints annotated per segment
      • D2 and D3: eleven-segment harness, with 5 points annotated per segment; D2 has 463 images with a simple background, D3 has 101 images with a complex background
      • D4: synthetic data

2023

HANDLOOM: Learned Tracing of One-Dimensional Objects for Inspection and Manipulation

Vainavi Viswanath et al., UCB AutoLab, CoRL 2023

DLOFTBS - Fast Tracking of Deformable Linear Objects with B-splines

P. Kicki, ICRA 2023

A Weakly Supervised Semi-Automatic Image Labeling Approach for Deformable Linear Objects

Alessio Caporali et al., Lar-unibo, University of Bologna, RAL 2023

  • A method for labeling DLOs in order to build a dataset: "the goal of this research is to provide a methodology to generate a mix of real-world and synthetic data and show the impact given by mixing the two obtained datasets."

Deformable Linear Objects 3D Shape Estimation and Tracking from Multiple 2D Views

Alessio Caporali et al., Lar-unibo, University of Bologna, RAL 2023

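  • Main work: the camera is mounted on the robot arm; after the DLO is extracted, its direction of extension is estimated and used to choose the arm motion so that the DLO stays centered in the camera view
  • Limitations:
    • Only handles static scenes ("deals with static scenes, i.e. the DLOs are still between the image acquisitions"); dynamic scenes remain to be addressed ("devoted to addressing dynamic scenes")
    • Multi-DLO scenes: the method "can be susceptible to the quality of the extracted splines"; "increasing robustness in cluttered conditions" remains open
  • DLO extraction from a single image: FASTDLO
  • Fitting the extracted DLO: cubic B-spline
  • View selection (a tracking-style approach): first estimate the DLO in the current image, then move along the DLO's direction of extension (principal direction) while keeping it centered in the view

    In particular, after the estimation of a given section of the DLO, the camera is moved forward along the DLO principal direction and centered with respect to the estimated points. In order to increase the portion of the same DLO visible in every sample, it is assumed to have the camera oriented along the DLO main axis and to record the samples by sliding orthogonally to it.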
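
One way to picture the "principal direction" used for view selection is a PCA of the estimated 2D points. The sketch below computes the centroid and the dominant axis from the 2x2 covariance. Using PCA here is an assumption made for illustration, not necessarily how the paper computes the direction, and `principal_direction` is a hypothetical name.

```python
import math

def principal_direction(points):
    """Return (centroid, unit direction) of the dominant axis of 2D points."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    sxx = sum((p[0] - cx) ** 2 for p in points) / n
    syy = sum((p[1] - cy) ** 2 for p in points) / n
    sxy = sum((p[0] - cx) * (p[1] - cy) for p in points) / n
    # Orientation of the largest-eigenvalue eigenvector of the covariance.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (cx, cy), (math.cos(theta), math.sin(theta))
```

The centroid gives the point to center the camera on, and the direction gives the axis to advance along. The sign of the direction is ambiguous, so a tracker would have to pick the sense that continues the previous motion.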

Learning to Estimate 3-D States of Deformable Linear Objects from Single-Frame Occluded Point Clouds

Kangchen Lv, et al., Tsinghua University, ICRA 2023

  • Takes a single-frame point cloud as input and estimates the 3D state of the DLO
  • Runs two methods in parallel: end-to-end regression and point-to-point voting

RT-DLO: Real-time deformable linear objects instance segmentation

Alessio Caporali et al., Lar-unibo, IEEE Trans. Ind. Informat. 2023

mBEST: Realtime deformable linear object detection through minimal bending energy skeleton pixel traversals

Andrew Choi et al., UCLA, RAL 2023

  • Pure perception; mainly addresses the problem of wires crossing in the image
  • Seven steps:
    • DLO Segmentation: DCNN model + color filtering ("we use FASTDLO's pretrained DCNN model")
    • Skeletonization
    • Keypoint Detection
    • Split End Pruning
    • Intersection Clustering, Matching, and Replacement
    • Minimal Bending Energy Path Generation
    • Crossing Order Determination
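
The keypoint detection step can be illustrated by counting 8-connected neighbours on the skeleton: pixels with one neighbour are treated as endpoints and pixels with three or more as intersection candidates. This is a generic sketch under that assumption, not mBEST's implementation; `skeleton_keypoints` is a hypothetical name.

```python
def skeleton_keypoints(skel):
    """Classify skeleton pixels by their number of 8-connected neighbours."""
    rows, cols = len(skel), len(skel[0])
    ends, branches = [], []
    for r in range(rows):
        for c in range(cols):
            if not skel[r][c]:
                continue
            neighbours = sum(
                skel[r + dr][c + dc]
                for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                if (dr or dc) and 0 <= r + dr < rows and 0 <= c + dc < cols)
            if neighbours == 1:
                ends.append((r, c))
            elif neighbours >= 3:
                branches.append((r, c))
    return ends, branches
```

With this simple rule, several adjacent pixels around one junction are usually flagged together, which is one plausible reason why a clustering step follows in the pipeline.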

TrackDLO: Tracking deformable linear objects under occlusion with motion coherence

Jingyi Xiang et al., UIUC, RAL 2023

  • Code: RMDLO/trackdlo
  • key node based + Motion Coherence Theory
  • Rather than using DLO tracking information to plan manipulation, it uses the DLO's motion to track it better
    • Main contributions:
      • Real-time DLO tracking without modeling, simulation, visual markers, or contact
      • Handles self-occlusion without modeling
      • This is achieved through the application of Motion Coherence Theory to impute the spatial velocity of occluded nodes, the use of the topological geodesic distance to track self-occluding DLOs, and the introduction of a non-Gaussian kernel that only penalizes lower-order spatial displacement derivatives to reflect DLO physics.
  • Handles tip occlusion, mid-section occlusion, and self-occlusion, with real-time tracking
  • A modified Coherent Point Drift (CPD) algorithm (based on Gaussian Mixture Model (GMM) clustering?)
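
To see why a geodesic distance helps with self-occlusion, the sketch below measures distance along the chain of nodes instead of straight through space. It is a simplified illustration, assuming the DLO is an ordered list of nodes; `geodesic_distances` is a hypothetical name and this is not TrackDLO's code.

```python
import math

def geodesic_distances(nodes, index):
    """Arc-length distance from nodes[index] to every node along the chain."""
    # Cumulative arc length measured from the first node.
    arc = [0.0]
    for a, b in zip(nodes, nodes[1:]):
        arc.append(arc[-1] + math.dist(a, b))
    return [abs(s - arc[index]) for s in arc]
```

For a U-shaped chain such as `[(0, 0), (0, 1), (1, 1), (1, 0)]`, the two tips are close in Euclidean terms but far apart along the object, so a kernel built on the geodesic distance does not couple their motion.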

2022

Deformable One-Dimensional Object Detection for Routing and Manipulation

Azarakhsh Keipour et al., CMU & Google, RAL 2022

  • Geometric model: a chain of fixed-length cylindrical segments connected with passive spherical joints
  • Six steps: segmentation (not implemented in the paper; borrowed from other solutions) -> Topological Skeletonization -> Contour Extraction -> DOO Fitting -> Pruning -> Merging
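
The chain-of-segments model can be sketched in a simplified planar form: each segment has the same length and its own heading, and the vertices follow by accumulation. The paper's model is 3D with spherical joints; the 2D restriction and the name `chain_points` are assumptions made only to keep the example short.

```python
import math

def chain_points(origin, segment_length, headings):
    """Vertices of a planar chain of equal-length segments.

    `headings` holds the absolute direction of each segment in radians.
    """
    points = [origin]
    for heading in headings:
        x, y = points[-1]
        points.append((x + segment_length * math.cos(heading),
                       y + segment_length * math.sin(heading)))
    return points
```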

FASTDLO: Fast Deformable Linear Objects Instance Segmentation

Alessio Caporali et al., RAL 2022

FASTDLO focuses directly on solving the intersection areas of the image between multiple DLOs to distinguish the instances, increasing the speed and accuracy of the results.

  • Does not use SLIC (a superpixel algorithm)?
  • Six steps
    1. Background segmentation: a DCNN performs the segmentation and outputs a binary mask. 32,000 images were rendered with Blender, and **DeeplabV3+** was chosen as the segmentation DCNN
    2. Skeleton pixels classification: generate a skeleton from the mask and classify its pixels by their local neighborhoods
    3. Segments generation: locate where crossings occur
    4. Intersections processing: "A shallow neural network to predict connection probabilities among endpoint-pairs"
    5. Informed merging
    6. Intersections layout
  • It (Ariadne+) employs the same segmentation network architecture of the one introduced in Section III-A (background segmentation) to distinguish the DLOs from the scene

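Setup and usage

  • For dependencies, see the Ariadne+ setup; shapely must be installed in addition
    • pip3 install wheel shapely arrow termcolor matplotlib
  • Installation: run pip install . in the repository root
  • Testing: after cloning the project, create a "weights" folder (note the plural) in the root directory to hold the network weights. run.py is the example script; the weights directory and the input image are both configured in that source file
  • In fastdlo/proc/utils.py, line 50: change the type to np.int64 (presumably replacing the np.int alias, which was removed in NumPy 1.24)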

Tracking fast trajectories with a deformable object using a learned model

J. A. Preiss, et al., ICRA 2022


Ariadne+: Deep Learning-based Augmented Framework for the Instance Segmentation of Wires

Alessio Caporali et al., IEEE Trans on Industrial Informatics 2022

Ariadne+ [10] was recently introduced as an improved version of Ariadne concerning several aspects: better accuracy and efficiency, ability to consider even more complex scenarios in which the endpoints of the cable were not present in the image. Currently, Ariadne+ represents the state of the art in terms of DLOs instance segmentation. However, its throughput is limited to a few FPS providing a strong limiting factor for its applicability on real-world applications.

  • Steps:
    • Semantic Segmentation: DeepLabV3+ with a ResNet-101 backbone, trained on about 30K samples for 200 epochs with batch size 10 and output stride 16. "It consists of around 30K samples of synthetic images of wires of different shapes and colors obtained using a novel approach based on a Chroma-Key method with background swapping"
    • Superpixel segmentation: the chosen algorithm is SLIC (Simple Linear Iterative Clustering)
    • Graph Generation: builds an undirected, unweighted RAG (Region Adjacency Graph) on top of the superpixel segmentation
    • Graph Simplification
    • Graph Clustering
    • Intersection Score Evaluation: a DCNN (TripletNet) handles the intersections. "In case intersection nodes are present, the scores of their neighbours couples are evaluated using a DCNN called TripletNet". The network is ResNet18 plus one fully connected layer
    • Path Finder
    • Paths Layout Inference: a DCNN (CrossNet) decides which wire lies on top where two wires cross; it is also ResNet18 plus one fully connected layer
    • B-Spline Modelling: finally, a cubic B-spline approximates the nodes on the computed path
  • Training details:
    • Segmentation network: "DeepLabV3+ is trained with a ResNet-101 backbone for 200 epochs, with batch size 10, output stride 16, separable convolutions, using Adam for the optimization and employing a polynomial learning rate adjustment policy starting from 10^-6 to a minimum of 10^-9, with power 0.95. The training dataset is obtained from 90% of the electric wires synthetic dataset (Sec. III-A), while the validation is done on the remaining 10%. The data augmentation scheme includes hue randomization, channel shuffling, flipping and finally resizing (360 × 640). The early stopping is configured to end the training process when the validation loss does not decrease for 5 epochs in a row."
    • TripletNet: "TripletNet, that is described in Sec. III-F, is trained for 100 epochs, with batch size 256, using Adam for the optimization and applying a learning rate of 10^-4. The dataset used for the training is composed of around 3000 samples organized offline into 4500 different triplets. The usual split of 90-10 for training and validation is used. The data augmentation includes hue, saturation and value randomization plus channel shuffling. An early stopping strategy is employed by monitoring the validation loss with a patience of 10 epochs."
    • CrossNet: "CrossNet, described in Sec. III-H, is trained in a way similar to the previous network. Hence, for 100 epochs, with batch size 256, using Adam for the optimization and applying a learning rate of 5×10^-5. In this case, the dataset is made of around 1500 samples equally divided between the two classes and it is split in the 90-10 way. As data augmentation, alongside hue, saturation, value randomization and channel shuffling, flipping and random brightness and contrast are employed. Also, during this training, an early stopping strategy is employed with a patience of 10 epochs."
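
The "polynomial learning rate adjustment policy" quoted for the segmentation network can be sketched as below. The quoted text gives only the start value, the minimum, and the power; the exact formula is an assumption (the common "poly" schedule), and `poly_lr` and `total_steps` are hypothetical names.

```python
def poly_lr(step, total_steps, base_lr=1e-6, min_lr=1e-9, power=0.95):
    """Polynomial decay from base_lr towards min_lr over total_steps."""
    # Clamp progress to [0, 1] so the rate never drops below min_lr.
    progress = min(step, total_steps) / total_steps
    return (base_lr - min_lr) * (1.0 - progress) ** power + min_lr
```

With a power just below 1, the curve is close to a linear ramp from the start value down to the minimum.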

Setup and usage

  • Source code
  • Environment setup:
    • pip3 install arrow
    • pip3 install termcolor
    • pip3 install igraph
    • pip3 install scikit-image
    • pip3 install pytorch-lightning
      • 1.7.1 works (PyTorch must then be force-downgraded to 1.8.2)
      • Alternatively, install 1.6.5
    • PyTorch version: 1.8.2 works
      • CPU version
        • pip3 install torch==1.8.2 torchvision==0.9.2 torchaudio===0.8.2 --extra-index-url https://download.pytorch.org/whl/lts/1.8/cpu
      • CUDA 11 version
        • pip3 install torch==1.8.2 torchvision==0.9.2 torchaudio==0.8.2 --extra-index-url https://download.pytorch.org/whl/lts/1.8/cu111
      • Force-install PyTorch 1.8 after installing PyTorch Lightning (PL)
    • RLException: Roslaunch got a 'No such file or directory' error while attempting to run...
      • Fix: in run_ariadne_service.py, change the first line to #!/usr/bin/python3

2021

3D DLO Shape Detection and Grasp Planning from Multiple 2D Views

Alessio Caporali et al., IEEE/ASME AIM 2021

  • In this paper:

An algorithm called ARIADNE [DeGregorio_ACCV2018] is used to detect the DLOs from each input camera image through deep-learning based segmentation and providing a B-Spline representation of the DLO in pixel coordinates.

Through deep-learning based segmentation? This probably refers to using YOLO to find the cable endpoints.

  • Steps:
    • **DeepLabV3+** removes the background and binarizes the image, producing a binary mask; the training set is synthetic data

    • SLIC superpixel segmentation is then applied to the binary mask

      • Why run superpixel segmentation on an image that has already been binarized?

      The idea of superpixels is to partition the image into local meaningful areas making the further processing easier and faster. In particular, a modified version, called MaskSlic [13], that is capable of applying the superpixel segmentation only on a region of interest (i.e. the foreground pixels) is used, due to the availability of the binary mask previously computed. The application of MaskSlic allows us to exploit completely the result of the initial semantic segmentation.

    • Build the region adjacency graph (RAG)
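
Building a region adjacency graph from a superpixel label map can be sketched as follows. The label map is assumed to be a 2D grid of integers with 0 marking background (as a masked superpixel method would leave it); the 4-connectivity and the name `region_adjacency_graph` are assumptions for illustration.

```python
def region_adjacency_graph(labels, background=0):
    """Map each superpixel label to the set of labels it touches."""
    graph = {}
    rows, cols = len(labels), len(labels[0])
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            if a == background:
                continue
            graph.setdefault(a, set())
            # Looking only right and down still visits every adjacent pair.
            for nr, nc in ((r, c + 1), (r + 1, c)):
                if nr < rows and nc < cols:
                    b = labels[nr][nc]
                    if b != a and b != background:
                        graph[a].add(b)
                        graph.setdefault(b, set()).add(a)
    return graph
```

The result is undirected and unweighted, matching the kind of RAG described in these notes.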

Tracking Partially-Occluded Deformable Objects while Enforcing Geometric Constraints

Yixuan Wang et al., Umich, ICRA 2021


2019

Multi-view Reconstruction of Wires using a Catenary Model

Ratnesh Madaan et al., ICRA 2019

  • model based

Detection and Reconstruction of Wires Using Cameras for Aircraft Safety Systems

Adam Stambler et al., ICRA 2019

  • CNN

2018

Let’s Take a Walk on Superpixels Graphs: Deformable Linear Objects Segmentation and Model Estimation

Daniele De Gregorio et al., ACCV 2018

  • Source code
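  • Over-segmentation of the source image into superpixels to build a Region Adjacency Graph (RAG)
  • Six steps of DLO detection:
    0. Endpoint detection, delegated to an external algorithm: a YOLOv2 model pre-trained on ImageNet and fine-tuned on the Electrical Cable Dataset
    1. Superpixel Segmentation: partition the original image into adjacent sub-regions (superpixels), which greatly reduces the number of regions to search compared with the pixel grid. An adjacency graph is then built from the neighbor relations between superpixels, and the endpoints detected in step 0 serve as seeds
    2. Start walks: starting from the seeds, search along the adjacency graph (creating a set of search directions?)
    3. Extend walks: from the current node, move to the best next adjacent superpixel
    4. Terminate walks
    5. Discard Unlikely Walks
  • Superpixel segmentation method: the paper uses SLIC ("SLIC Superpixels Compared to State-of-the-Art Superpixel Methods")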
    • Segmentation with high compactness

      During the clustering process the compactness of each cluster can be either increased or reduced to the detriment of visual similarity. In other words we can choose easily to assign more importance to visual consistency of superpixels or to their spatial uniformity.

    • Search

      Similar to the Region Growing algorithm. Restrict the search along a walk by applying several model-based constraints rather than relying only on visual similarity

      • Visual likelihood: uses a color histogram in the HSV color space
      • Curvature likelihood
      • Distance likelihood
      • Estimation of the most likely walk
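
The way the three likelihoods might combine when extending a walk is sketched below. Everything specific here is an assumption for illustration: the histogram intersection as visual similarity, the Gaussian-shaped curvature and distance terms, their widths, the product rule, and all function names. The paper's actual formulas are not reproduced.

```python
import math

def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two normalised colour histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def turning_angle(prev, cur, nxt):
    """Absolute heading change (radians) when stepping prev -> cur -> nxt."""
    h_in = math.atan2(cur[1] - prev[1], cur[0] - prev[0])
    h_out = math.atan2(nxt[1] - cur[1], nxt[0] - cur[0])
    diff = h_out - h_in
    return abs(math.atan2(math.sin(diff), math.cos(diff)))

def best_next(prev, cur, cur_hist, candidates, sigma_angle=0.5, sigma_dist=10.0):
    """Pick the (centroid, histogram) candidate with the highest joint score."""
    def score(candidate):
        centroid, hist = candidate
        visual = histogram_intersection(cur_hist, hist)
        angle = turning_angle(prev, cur, centroid)
        curvature = math.exp(-(angle / sigma_angle) ** 2)
        distance = math.exp(-(math.dist(cur, centroid) / sigma_dist) ** 2)
        return visual * curvature * distance
    return max(candidates, key=score)
```

A candidate that continues straight ahead with a similar colour wins over one that turns sharply or has a different appearance, which is the intuition behind constraining the walk with a cable model.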

DROAN - Disparity-Space Representation for Obstacle Avoidance: Enabling Wire Mapping & Avoidance

Geetesh Dubey et al., IROS 2018

  • CNN

2017

Fast, accurate thin-structure obstacle detection for autonomous mobile robots (2017)

Chen Zhou et al., CVPR workshop 2017

  • Problem addressed: detecting thin-structure obstacles (wires, cables, tree branches)

  • Approach: edge-based visual odometry techniques. "Perform thin obstacle detection using video sequences from a monocular camera or a stereo camera pair"

  • Details

    1. Edge extraction: DoG (Difference of Gaussians) detector for edge detection + a consequent Canny-style hypothesis linking step
      1. Reason for choosing DoG: its good repeatability
      2. Reason for the hypothesis linking step: it improves the recall of weak edges, which are common on thin obstacles
    2. Edge 3D reconstruction
      • Monocular camera: VO (Visual Odometry) algorithm incorporating IMU data
      • Stereo camera: edge-based stereo matching
    3. Obstacle labeling on edge depth maps
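
The Difference of Gaussians idea behind the edge extraction step can be shown on a single image row. The sketch is a generic 1D DoG with border clamping; the sigma values and function names are assumptions and are not taken from the paper.

```python
import math

def gaussian_kernel(sigma):
    """Normalised 1D Gaussian kernel truncated at three sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-0.5 * (i / sigma) ** 2)
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma):
    """Convolve a 1D signal with a Gaussian, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(signal) - 1
    return [sum(k * signal[min(max(i + j - radius, 0), last)]
                for j, k in enumerate(kernel))
            for i in range(len(signal))]

def difference_of_gaussians(signal, sigma_small=1.0, sigma_large=2.0):
    """Band-pass response: narrow blur minus wide blur."""
    narrow = smooth(signal, sigma_small)
    wide = smooth(signal, sigma_large)
    return [a - b for a, b in zip(narrow, wide)]
```

A one-pixel-wide bright line, which is roughly what a wire looks like in an intensity row, gives a clear positive peak at its position, while flat regions give a response near zero.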