
Python: What is the best way to do this (regex)


Stratus_ss

Overclockix Snake Charming Senior, Alt OS Content
Joined
Jan 24, 2006
Location
South Dakota
Given the following code:

Code:
import re

stable_version = r"([\d\.]+[02468])"

line = r"http://ftp.gnome.org/pub/gnome/sources/yelp/([\d\.]+[02468])/yelp-([\d\.]+)\.tar\.xz"

print(re.findall(r'\[\\d\\.\]\+*.*/', line))

What is the best way to return both the "([\d\.]+[02468])" and "([\d\.]+)"? What I am ultimately trying to do is set both of the "([\d\.]+" sections of the line to the same pattern, so that if I pass in "stable_version" or "unstable_version", each instance of "([\d\.]+" gets set to the same value.

Thanks for any pointers

EDIT: here is what I am currently running with:

Code:
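import fileinput
import os
import re

uscan_version = r"([\d\.]+[02468])"
uscan_directories = ["/some/path/yelp-3.10.1"]
watch_file = "debian/watch"  # assumed path; the original post does not show where this is set
for directory in uscan_directories:
        os.chdir(directory)
        print("Beginning uscan of %s" % directory)
        # With inplace=True, everything printed inside this loop is written back to the file
        for line in fileinput.FileInput(watch_file, inplace=True):
                if "http" in line:
                    # Package name: the directory basename without its trailing version
                    joined_line = "-".join(os.path.basename(directory).split("-")[:-1])
                    # Current text of the path segment that follows the package name
                    replace_this_regex = re.findall(r'%s/(.*?)/' % joined_line, line)[0].replace("'\\'", "'\'")
                    print(line.replace(replace_this_regex, uscan_version).rstrip())
                else:
                    print(line, end="")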
 
I'm a little unclear what you're trying to accomplish here, could you give an example of expected input and output?
 
Sure no problem

Given this line:

Code:
line = "http://ftp.gnome.org/pub/gnome/sources/yelp/([\d\.]+[02468])/yelp-([\d\.]+)\.tar\.xz"

I want to be able to pass in "stable" to my script and turn the line into:

Code:
line = "http://ftp.gnome.org/pub/gnome/sources/yelp/([\d\.]+[02468])/yelp-([\d\.]+[02468])\.tar\.xz"

If I pass in "unstable" I would get:

Code:
line = "http://ftp.gnome.org/pub/gnome/sources/yelp/([\d\.]+)/yelp-([\d\.]+)\.tar\.xz"

Of course the purpose is to rip through a huge number of URLs, so I thought that a regex would be best, especially considering that it's unpredictable whether the entry will look like "([\d\.]+[02468])" or "([\d\.]+)" by default.

Is this more clear?
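
For reference, the transformation described above could be sketched with a single re.sub call. This is only a rough sketch: the names STABLE, UNSTABLE, GROUP_RE and set_version_pattern are made up for illustration, and it assumes the two group forms appear in the line exactly as written in this post.

```python
import re

# Literal text of the two capture-group forms that appear in the URLs.
STABLE = r"([\d\.]+[02468])"
UNSTABLE = r"([\d\.]+)"

# Matches either form as literal text: the shared prefix "([\d\.]+",
# an optional "[02468]", then the closing ")".
GROUP_RE = re.compile(
    re.escape(r"([\d\.]+") + "(?:" + re.escape("[02468]") + ")?" + re.escape(")")
)


def set_version_pattern(line, mode):
    """Return line with every version group set to the form chosen by mode."""
    if mode not in ("stable", "unstable"):
        raise ValueError("mode must be 'stable' or 'unstable'")
    target = STABLE if mode == "stable" else UNSTABLE
    # A function replacement keeps re.sub from interpreting the backslashes
    # in the replacement text as escape sequences.
    return GROUP_RE.sub(lambda _match: target, line)


line = r"http://ftp.gnome.org/pub/gnome/sources/yelp/([\d\.]+[02468])/yelp-([\d\.]+)\.tar\.xz"
print(set_version_pattern(line, "stable"))
print(set_version_pattern(line, "unstable"))
```

Because the pattern matches both forms, it should not matter which form a given URL starts with.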
 
Could you use the replace function? I don't know how cleanly it would work, but it would be something like this if you had the URLs all in a text document, if I'm understanding correctly:
Code:
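# The pattern is literal text here, so it has to be a quoted (raw) string
s = r'([\d\.]+[02468])'
with open('urls.txt') as f:
    # Swap every "any version" group for the "even versions only" group
    m = f.read().replace(r'([\d\.]+)', s)
print(m)
with open('urls2.txt', 'w') as f:
    f.write(m)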
This opens one text document, edits all the URLs, and saves them to another text document.

Something along those lines. Just swap the "s" and "replace" values if you want to switch them back to unstable.

Edit: I just reread your code and it seems similar, although it's kind of hard to understand how it's working. Does ([\d\.]+) basically stand for unstable, and you want to change it to ([\d\.]+[02468])?

I also deleted a line from my code that wasn't needed in this context. I actually pulled it from the texter program setup, which opens a .txt file and saves it as .py. It will save to whatever file extension you'd like, and can probably open any type as well.
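
For what it's worth, a plain str.replace looks like it can go in both directions without double-editing a line, because the text "([\d\.]+)" never occurs inside "([\d\.]+[02468])". A small sketch of that idea follows; the helper names to_stable and to_unstable are hypothetical.

```python
STABLE = r"([\d\.]+[02468])"
UNSTABLE = r"([\d\.]+)"

# The unstable text is not a substring of the stable text, so replacing
# one form never touches groups that are already in the other form.
assert UNSTABLE not in STABLE


def to_stable(text):
    """Turn every unstable group into the stable one; stable groups stay as-is."""
    return text.replace(UNSTABLE, STABLE)


def to_unstable(text):
    """Turn every stable group into the unstable one; unstable groups stay as-is."""
    return text.replace(STABLE, UNSTABLE)


mixed = r"http://ftp.gnome.org/pub/gnome/sources/yelp/([\d\.]+[02468])/yelp-([\d\.]+)\.tar\.xz"
print(to_stable(mixed))
print(to_unstable(mixed))
```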
 
I'm not a Python guy (by any means), but can you not set the pattern based on arguments, i.e., whatever the Python equivalent to this is?

Code:
switch (argv[1])
{
  case "stable":
    pattern = "([\d\.]+[02468])"
    break;
  case "unstable":
    pattern = "([\d\.]+)"
    break;
}

line = "http://ftp.gnome.org/pub/gnome/sources/yelp/" + pattern + "/yelp-" + pattern + "\.tar\.xz"

Or something similar?


[edit]
I guess python doesn't have a switch/case equivalent, so you could use if/else, or dictionaries apparently.
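
A dictionary version might look roughly like this. It is only a sketch: PATTERNS and build_line are made-up names, and the stable/unstable mapping follows the first post (stable means the version ends in an even digit).

```python
# Lookup table standing in for switch/case.
PATTERNS = {
    "stable": r"([\d\.]+[02468])",
    "unstable": r"([\d\.]+)",
}


def build_line(mode):
    """Build the yelp URL with the same pattern in both version slots."""
    pattern = PATTERNS[mode]  # raises KeyError for an unknown mode
    return (
        "http://ftp.gnome.org/pub/gnome/sources/yelp/"
        + pattern + "/yelp-" + pattern + r"\.tar\.xz"
    )


print(build_line("stable"))
print(build_line("unstable"))
```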
 
I figured I'd take this opportunity to google some python.

What about something like this:
Code:
Justin@Justin-Desktop ~
$ cat test.py
import sys

# Mapping follows the first post: stable versions end in an even digit
if sys.argv[1] == "stable":
        pattern = r"([\d\.]+[02468])"
elif sys.argv[1] == "unstable":
        pattern = r"([\d\.]+)"
else:
        # Stop here; otherwise "pattern" would be undefined below
        sys.exit("expected 'stable' or 'unstable'")
line = "http://ftp.gnome.org/pub/gnome/sources/yelp/" + pattern + "/yelp-" + pattern + r"\.tar\.xz"
print(line)

Justin@Justin-Desktop ~
$ python test.py stable
http://ftp.gnome.org/pub/gnome/sources/yelp/([\d\.]+[02468])/yelp-([\d\.]+[02468])\.tar\.xz

Justin@Justin-Desktop ~
$ python test.py unstable
http://ftp.gnome.org/pub/gnome/sources/yelp/([\d\.]+)/yelp-([\d\.]+)\.tar\.xz

It seems like maybe I still don't quite get what you're asking.
 
First of all, thanks for the replies, nothing worse than a thread that gets no attention.

So here is what I am trying to achieve. When you pull down source code in Debian, there is a file in there called the "watch" file. This file has a URL to the upstream package location.

Since we are using yelp as an example: "stable" versions are even-numbered and "unstable" versions are odd-numbered. If Debian has yelp v3.8.2 and I want to pull down the absolute latest version, I change the watch file so that the URL reads "([\d\.]+)" instead of "([\d\.]+[02468])". With that change, version 3.11.3 will be pulled down.
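
To make the even/odd point concrete, here is a rough illustration of how the two patterns treat a few series numbers. It uses re.fullmatch on bare version strings, which is an assumption for the sake of the example and not a reproduction of how uscan applies the pattern to upstream links; is_stable_series is a made-up helper name.

```python
import re

stable_only = re.compile(r"([\d\.]+[02468])")
any_version = re.compile(r"([\d\.]+)")


def is_stable_series(series):
    """True when the whole string matches the even-final-digit pattern."""
    return stable_only.fullmatch(series) is not None


for series in ["3.8", "3.10", "3.11", "3.12"]:
    # Columns: series, matches the stable-only pattern, matches the any-version pattern
    print(series, is_stable_series(series), any_version.fullmatch(series) is not None)
```

The stable-only pattern only looks at the final digit, which is enough to decide whether the last number is even or odd.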

If there are a number of packages which need to be pulled down (not just yelp), it is easier to script this (I thought). My first attempt sort of works, but I was only changing the minor release version. On the FTP site the layout is:

<package>/<series>/<point release>

The way my script works currently is it will do this

<package>/<series>/<unstable/stable>

I want it to do

<package>/<unstable/stable>/<unstable/stable>

In this case, PCGAMER's code will not work, because I would still have to find, via some method, the location of "([\d\.]+)" in the file and replace it with a variable, then toggle the variable.

Hopefully the goal is more clear now.
 
What about updating the watch file dynamically with something like sed (or a python implementation of it)?

[edit]

Re-reading your previous posts, it looks like that's what you're doing already.
 
Yeah, I guess I can hack something out. I was just hoping for a more efficient way to do it. I can modify each pattern individually; I was just hoping to do it all at once.
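
One possible way to do it all at once is to rewrite the whole watch file in place, switching every version group on the URL lines in a single pass. The following is only a sketch under assumptions: rewrite_watch_file and GROUP_RE are hypothetical names, the file path is passed in by the caller, and the groups are assumed to be written exactly as "([\d\.]+)" or "([\d\.]+[02468])".

```python
import fileinput
import re

# Matches the literal text "([\d\.]+)" or "([\d\.]+[02468])".
GROUP_RE = re.compile(r"\(\[\\d\\\.\]\+(?:\[02468\])?\)")


def rewrite_watch_file(path, target):
    """Set every version group on the URL lines of the file at path to target."""
    # With inplace=True, anything printed inside the loop replaces the
    # file's contents, much like sed -i.
    with fileinput.input(files=[path], inplace=True) as watch:
        for line in watch:
            if "http" in line:
                # Function replacement: target contains backslashes that
                # re.sub must not treat as escape sequences.
                line = GROUP_RE.sub(lambda _match: target, line)
            print(line, end="")
```

Calling rewrite_watch_file("debian/watch", r"([\d\.]+)") from inside a source directory would then switch both slots to the unstable form, and passing r"([\d\.]+[02468])" would switch them back.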
 