Plus, you can use Python to hack. How cool is that?
I think in 2006 or something it was the highest-rated UK university for student satisfaction.
I still like to avoid overusing the re module, probably because I started programming when microcomputers had very little memory and processor power.
Also, I was using Python on a mobile phone (before the iPhone and Android, on Windows CE), and at first we didn’t have the re module.
Within the old Python developer community, there is (was?) a tendency to use string search rather than re, possibly as a reaction to the extreme abuse of regular expressions among some Perl coders.
Re can cost some memory, though only a little for the machines we have these days. It’s also very easy to make a huge mistake that may even crash the machine (I don’t know if that’s still so), especially if you are feeding it big data or data of unknown size, whereas in plain Python code it’s pretty easy to see what you are doing and where your mistake is.
And you learn a lot by doing your own parsing.
(It’s important to think about the extreme cases in programming, just as when we think about philosophical matters, since code that can handle small data may crash with larger data, or no data, or very bad data.)
As for Ruby, it’s not higher or better than Python in most ways.
A web app framework is also a bit like the re module. It can be (and often is) overkill and bloatware.
It adds some risks, although it can reduce the risk of badly written code if we are not so good.
It’s a little bit like thinking for yourself versus believing in a religion.
When you use a framework, you become a follower of it, and trust what the developer is doing.
When you write by yourself, you are taking the burden of doing things by yourself and getting the reward and punishment on your own responsibility.
It can be too much for some, but it’s more fun and we learn a lot from it.
As I started programming by doing everything from scratch, I got used to the troublesome part of writing your own things, too.
And I do use what others have written, of course.
But I still prefer open source software where I can, because I can read the source and understand what it’s doing, and I can modify it if I want. The last part is important because without access to the source code we would be enslaved and at the mercy of the developer for bug fixes and improvements. We suffered a lot (because of MS, etc.) in the past.
I like Python because it’s usually highly readable and very easy to make modifications, too.
If you learn assembler, and have to understand code written by 50 different people in order to modify it at a low level, you will understand what I mean.
So how much MORE memory does calling a module use? And I take it that is why you see things like from [module] import [sub]?
RE seems pretty small. Kind of like a lightweight gun or sword as opposed to your Kung fu. I get your point. I have pretty crazy OCD that I just leave untreated. I will eventually be known as an esoteric minimalist.
Will it teach you manners? if so… then I’m happy for you otherwise… no!
No they don’t teach that at university ya ding dong.
I’d say not much, by today’s standard.
My two main machines have 8 GB and 6 GB of memory. One has a 1 TB HDD and the other has a 128 GB SSD.
The computer I made about 30 years ago had only 4 KB of memory and no external storage (well, I was planning to connect a paper tape punch and reader, but I abandoned that later).
If you are using a Linux machine or VirtualBox, you could write some Python code to check the memory consumption of the re module, both when it’s just imported and while it’s in use with different sizes of data and different regexes.
Tell me what you find, as I’m a bit curious, too.
I do remember a comparison of different languages I did on a shared hosting server several years ago, though.
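One way to run that kind of check, as a rough sketch only: the standard tracemalloc module (Python 3.4 and later, so newer than the Python 2 docs linked in this thread) can report the peak memory allocated while a function runs. The helper names peak_bytes, with_re and with_find, the `<b>` tag and the test data are all made up for illustration. It only counts allocations made by Python during the call, not the one-time cost of importing re.

```python
import re
import tracemalloc


def peak_bytes(func, *args):
    """Run func(*args); return (result, peak bytes allocated during the call)."""
    tracemalloc.start()
    try:
        result = func(*args)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def with_re(text):
    # Non-greedy match between hypothetical <b>...</b> tags, across newlines.
    return re.findall(r"<b>(.*?)</b>", text, re.S)


def with_find(text):
    # Same extraction using plain string search.
    found, pos = [], 0
    while True:
        start = text.find("<b>", pos)
        if start == -1:
            break
        start += len("<b>")
        end = text.find("</b>", start)
        if end == -1:
            break
        found.append(text[start:end])
        pos = end + len("</b>")
    return found


if __name__ == "__main__":
    for n in (10, 1000, 10000):
        data = "junk<b>value</b>\n" * n
        _, re_peak = peak_bytes(with_re, data)
        _, find_peak = peak_bytes(with_find, data)
        print(n, "re:", re_peak, "find:", find_peak)
```

The numbers will vary by Python version and platform, so treat them as a comparison between the two approaches on your own machine rather than as absolute figures.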
It was like this, both size wise and speed wise:
C ==> Ocaml ========> Perl ====> Python ==> Ruby ================================> PHP
Python is doing better these days, thanks to Pyrex, Cython, and especially PyPy.
PHP is supposedly becoming more “normal”, after all, too.
As long as you know what you are doing, re IS very useful, and it can even be quicker and smaller than a chunk of badly written code, for sure.
And if speed or size isn’t critical, there is no real need to avoid it as much as I do.
I just like to do things in a certain odd way, and I don’t remember ever regretting it.
This week we’re doing functions and using modules. So I can work some Kung fu into it this time.
I think I’m going to try and use the PYTHONPATH to point to a web server so I can work in modules hosted online.
Do you mean including a network drive in PYTHONPATH, like \\serverX\somewhere\ (on Windows),
or using HTTP or FTP to download the module file(s) (with urllib2, for example)?
Actually the latter, now that I think about it.
But that might be jumping too far ahead for the class’s purposes lol
I kind of got in trouble for using re for the first assignment, but he said we could solve it any way.
I guess he is a standard “Pythonic” guy, which is good, and maybe he was expecting something like what I showed without re.
Now, I’m not so sure of the module loading assignment.
Downloading a module via HTTP/FTP and importing it doesn’t sound like basic course material.
I mean, we have to think about security issues when we do something like this, and there can be other issues depending on what kind of module.
Yea, that is a good point, so actually maybe I will not do that after all.
Thanks for your thoughts so far
So I am onto the second assignment, which is similar to this one.
I want to use this, except if I use this, it only works for a string where the tags are beside each other. In the XML file I’m using they are separated onto different lines. Is there a way to overcome that challenge?
Have you tried the S flag?
There is a flag that makes . match the newline \n as well (which isn’t the case without the flag).
You should be able to find it in the documentation for the re module: docs.python.org/2/library/re.html
In this case, it’s very important to use .*? because .* would eat up everything until the last set of tags.
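To see the flag and the non-greedy match working together, here is a small sketch. The XML from the assignment isn't shown in the thread, so the `<pname>` and `<price>` tags and the sample data are made up for illustration.

```python
import re

# Hypothetical sample: the first item has its tags on separate lines.
xml = """<item>
  <pname>
    Widget
  </pname>
  <price>
    $12345
  </price>
</item>
<item>
  <pname>Gadget</pname>
  <price>$678</price>
</item>"""

# re.S (re.DOTALL) lets "." match "\n"; ".*?" stops at the nearest closing tag.
pattern = re.compile(r"<pname>(.*?)</pname>.*?<price>(.*?)</price>", re.S)
prices = {name.strip(): price.strip() for name, price in pattern.findall(xml)}
print(prices)

# Without the flag, the multi-line item is missed.
no_flag = re.findall(r"<pname>(.*?)</pname>", xml)

# With the flag but a greedy ".*", one match runs to the LAST closing tag.
greedy = re.findall(r"<pname>(.*)</pname>", xml, re.S)
```

Here no_flag only picks up the one-line item, and greedy returns a single long string that swallows everything between the first `<pname>` and the last `</pname>`.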
Also, the one without re module doesn’t care if the tags are in the same line or not.
When we write something by ourselves, we don’t have to read the documents.
Yes, looking at your function more (and trying it) I see the advantage. I can’t use this unless I understand it in full, though, or else what is the point of the class?
i1 = i2 = i3 = 0 # initialize indexes
i4 = -t2i - 1 # initialize the last index so that it starts as 0
What does that part mean?
Or I guess, rather why isn’t it something like
i1 = i2 = i3 = i4 = 0
or
i2 = i3 = i4 = 0
i1 = -t1i - 1
? That is, if we’re saying ‘while 1’ and looping through? I guess I see it as kind of backwards. Thanks for all your help, btw.
These indexes keep the place where the search for the next tag starts.
i1, i2 and i3 don’t have to be initialized, because their values are set in the function before they are used.
You can remove that line, actually.
I put the initialization there to contrast with i4, which needs its initial value so that the calculation gives 0 in the first pass through the loop.
From the second pass on, i4 will hold the correct value, but without the initialization it would be undefined in the first pass.
Each index should indicate the location as below.
s = "blah(i1)pname(i2)foo(i3)$12345(i4)bar"
The string search done by the index method should start right after the matched tag, not at the tag’s own index. I decided on this so that I don’t have to do any calculation in the string slicing when we set the dictionary entry; it’s just a matter of preference.
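For anyone following along, here is a minimal sketch of the general idea of walking a string with four indexes and no re module. It is not the function discussed above (that one isn't reproduced here): the name parse_pairs, the tag arguments and the sample data are hypothetical, it uses str.find instead of index, and it places i1 and i3 just past the opening tags and i2 and i4 at the start of the closing tags, which may differ from the arrangement described.

```python
def parse_pairs(s, open1, close1, open2, close2):
    """Collect {text between open1/close1: text between open2/close2}."""
    result = {}
    pos = 0  # where the search for the next first tag starts
    while True:
        i1 = s.find(open1, pos)
        if i1 == -1:
            break
        i1 += len(open1)           # just past the first opening tag
        i2 = s.find(close1, i1)    # start of the first closing tag
        if i2 == -1:
            break
        i3 = s.find(open2, i2 + len(close1))
        if i3 == -1:
            break
        i3 += len(open2)           # just past the second opening tag
        i4 = s.find(close2, i3)    # start of the second closing tag
        if i4 == -1:
            break
        # Slices need no extra arithmetic because of where the indexes sit.
        result[s[i1:i2].strip()] = s[i3:i4].strip()
        pos = i4 + len(close2)     # continue right after the last tag
    return result


sample = "blah<pname>\n  Widget\n</pname>foo<price>\n  $12345\n</price>bar"
print(parse_pairs(sample, "<pname>", "</pname>", "<price>", "</price>"))
```

Since find treats a newline like any other character, this version doesn't care whether the tags are on the same line or not.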
Hmm.