Get text from an HTML tag using a single class name when the tag has multiple classes

I have a line of HTML with tags nested inside other tags, and a single tag may contain multiple classes. I need to extract the text using a single class name (I know only one of the class names).

<p class="Body1"><span class="style3"></span><span class="style1">W</span><span class="Allsmall style5">extract this text </span><span class="style5">unwanted text </span></p>

I know only the class name Allsmall. I want to extract the text "extract this text" from this HTML line using Jsoup in Java.


Source: https://stackoverflow.com/questions/24671247
Updated: 2020-03-26 12:00

Best answer

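You can use Jsoup's selector syntax to retrieve a specific element based on its CSS class attribute. A class selector such as span.Allsmall matches any <span> whose class attribute contains Allsmall, even when other classes are listed alongside it: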

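import java.io.File;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

Document doc = Jsoup.parse(new File("input.html"), "UTF-8", "http://sample.com/");

// Retrieve the first <span> element that has the "Allsmall" class
Element allSmallSpan = doc.select("span.Allsmall").first();
String text = allSmallSpan.text(); // the span's text content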
2014-07-10
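
For completeness, here is a minimal self-contained sketch of the same idea applied to the HTML line from the question. It assumes the jsoup library is on the classpath; the class name AllsmallExample, the constant SAMPLE_HTML, and the helper firstText are made up for this illustration and are not part of Jsoup or of the original answer. The HTML is parsed from a string instead of a file so the example needs no input.html.

```java
import java.util.Optional;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class AllsmallExample {
    // The HTML line from the question, inlined so no input file is needed.
    public static final String SAMPLE_HTML =
        "<p class=\"Body1\"><span class=\"style3\"></span><span class=\"style1\">W</span>"
        + "<span class=\"Allsmall style5\">extract this text </span>"
        + "<span class=\"style5\">unwanted text </span></p>";

    // Hypothetical helper: returns the text of the first element matching
    // cssQuery, or an empty Optional when nothing matches.
    public static Optional<String> firstText(String html, String cssQuery) {
        Document doc = Jsoup.parse(html);
        Element match = doc.select(cssQuery).first(); // null when there is no match
        return Optional.ofNullable(match).map(Element::text);
    }

    public static void main(String[] args) {
        // Knowing only one of the element's classes is enough for the selector.
        System.out.println(firstText(SAMPLE_HTML, "span.Allsmall").orElse("(no match)"));
    }
}
```

Wrapping the result in an Optional is a design choice for this sketch: first() returns null when no element matches, so calling text() directly on it would throw a NullPointerException for a class name that does not occur in the document. Dropping the tag name and selecting with ".Allsmall" should work the same way if the element type is not known either.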
