Shell script to extract data from file between two date ranges

I have a huge file in which each line starts with a timestamp, as shown below. I need a way to grep the lines between two dates. Is there an easy way to do this with sed or awk, rather than extracting the date fields from each line and comparing day, month, and year?

For example, I need to extract the data from 2013-06-01 to 2013-06-15 by checking the timestamp in the first field.

File contents:

2013-06-02T19:44:59;(3305,3308,2338,102116);aaaa;xxxx
2013-06-14T20:01:58;(2338);aaaa;xxxx
2013-06-12T20:01:58;(3305,3308,2338);bbbb;xxxx
2013-06-13T20:01:59;(3305,3308,2338,102116);bbbb;xxxx
2013-06-13T20:02:53;(2338);bbbb;xxxx
2013-06-13T20:02:53;(3305,3308,2338);aaaa2;xxxx
2013-06-13T20:02:54;(3305,3308,2338,102116);aaaa2;xxxx
2013-06-14T20:31:58;(2338);aaaa2;xxxx
2013-06-14T20:31:58;(3305,3308,2338);aaaa;xxxx
2013-06-15T20:31:59;(3305,3308,2338,102116);bbbb;xxxx
2013-06-16T20:32:53;(2338);aaaa;xxxx
2013-06-16T20:32:53;(3305,3308,2338);aaaa2;xxxx
2013-06-16T20:32:54;(3305,3308,2338,102116);bbbb;xxxx

Source: https://stackoverflow.com/questions/17465686
Updated: 2020-03-23 22:27

Best answer

It may not be your first choice, but Perl is well suited to this task.

perl -ne "print if ( m/2013-06-02/ .. m/2013-06-15/ )" myfile.txt

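This works because .. acts as a range (flip-flop) operator here: once the first trigger matches (i.e. m/2013-06-02/), the print is executed on every line up to and including the first line that matches the second trigger (i.e. m/2013-06-15/).

However, this trick won't work if you specify m/2013-06-01/ as the first trigger, because that pattern never matches in your file. The range is also positional: it opens and closes on whichever lines happen to match, so it ignores the actual date values on the lines in between.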
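The behaviour described above can be checked with a small experiment. This is only a sketch: the temporary file and the 2013-06-15T21:00:00 line are made up for illustration, and the patterns are anchored with ^ (which the original command does not do) so that they can only match the leading timestamp.

```shell
# Hypothetical sample in the question's format; the second 2013-06-15 line
# is invented to show where the range closes.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2013-06-02T19:44:59;(3305,3308,2338,102116);aaaa;xxxx
2013-06-13T20:02:53;(2338);bbbb;xxxx
2013-06-15T20:31:59;(3305,3308,2338,102116);bbbb;xxxx
2013-06-15T21:00:00;(2338);cccc;xxxx
2013-06-16T20:32:53;(2338);aaaa;xxxx
EOF

# The range opens on the first 2013-06-02 line and closes on the first
# 2013-06-15 line, so the later 2013-06-15 line is not printed.
matched=$(perl -ne 'print if ( m/^2013-06-02/ .. m/^2013-06-15/ )' "$sample")

# A start trigger that never matches leaves the range closed: no output.
unmatched=$(perl -ne 'print if ( m/^2013-06-01/ .. m/^2013-06-15/ )' "$sample")

printf '%s\n' "$matched"
if [ -z "$unmatched" ]; then
  echo "the 2013-06-01 trigger printed nothing"
fi
rm -f "$sample"
```

Run against this sample, the first command prints three lines and stops after the first line dated 2013-06-15, which is worth keeping in mind if the end date has several entries.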

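A less exciting but more robust technique is to extract the date from the start of each line and test it. Because YYYY-MM-DD dates sort as strings in the same order as they do chronologically, the string operators ge and le are enough: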

perl -ne 'if ( m/^([0-9-]+)/ ) { $date = $1; print if ( $date ge "2013-06-01" and $date le "2013-06-15" ) }' myfile.txt

(I tested both expressions and they work.)
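Since the question asked about awk, the same string comparison could be sketched there as well. This is not part of the original answer: the function name filter_dates, its arguments, and the sample file (including the 2013-05-31 line) are hypothetical, and the sketch assumes every line begins with a YYYY-MM-DD date.

```shell
# filter_dates START END FILE
# Print the lines of FILE whose leading YYYY-MM-DD lies in [START, END].
# Hypothetical helper; assumes each line starts with an ISO-style date.
filter_dates() {
  awk -v start="$1" -v end="$2" '
    {
      day = substr($0, 1, 10)          # leading YYYY-MM-DD of the timestamp
      if (day >= start && day <= end)  # string comparison, like ge/le in Perl
        print
    }
  ' "$3"
}

# Hypothetical, deliberately unsorted sample in the question's format.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2013-06-02T19:44:59;(3305,3308,2338,102116);aaaa;xxxx
2013-06-16T20:32:53;(2338);aaaa;xxxx
2013-06-15T20:31:59;(3305,3308,2338,102116);bbbb;xxxx
2013-05-31T23:59:59;(2338);bbbb;xxxx
EOF

result=$(filter_dates 2013-06-01 2013-06-15 "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Unlike the range trick, this tests every line on its own, so it does not depend on line order or on the boundary dates actually appearing in the file.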

2013-07-04
