There's not much here to go on. Are you asking how to write a module that you can import?
Are these the same set of DB files every time? Are the columns and other configurations the same? Are you writing new python code every month?
Are you using some ETL process to spit out a bunch of files that you'd like to have imported and available easily? Are the formats the same but the filenames differ?
You might also be asking: if I write a module, how do I make it available for all my new Python projects to use? You could just copy your whatever-my-module-is-called.py file around to your new projects (this might be simplest), but if you also expect to be updating it and would like all of your projects to use the updated code, there are alternatives. One is to add the directory containing it to your PYTHONPATH. Another is to install it in editable mode (pip install -e) in your Python environment.
[I get the impression you're a data person rather than a programmer - perhaps you have a colleague who's more of the latter you can tap up for this? It doesn't have to be difficult, but there's typically a little bit of ceremony involved in setting up a shared module however you choose to do it.]
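A rough sketch of what the PYTHONPATH option amounts to: Python looks for modules in the directories listed in sys.path, and PYTHONPATH simply adds entries to that list at startup. The module name shared_database and the temporary directory are stand-ins invented for this illustration; in practice the directory would be wherever you keep the shared file.

```python
import sys
import tempfile
from pathlib import Path

# Stand-in for the shared directory (hypothetical; normally a fixed folder).
shared_dir = Path(tempfile.mkdtemp())

# Stand-in for the shared module (hypothetical name and contents).
(shared_dir / "shared_database.py").write_text('database = {"source": "example"}\n')

# Same effect as having shared_dir listed in PYTHONPATH when Python starts.
sys.path.insert(0, str(shared_dir))

from shared_database import database

print(database)
```

Any project whose sys.path includes that directory picks up the current version of the file, which is the point of not copying it around.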
If it is the first thing, just put the db setup code you're using in one file, call it "database.py"
database.py
# the code you commonly use, ending with
database = ...
From a second file in the same directory, write:
main_program.py
from database import database
# The first "database" here is the module name.
# The second "database" is a variable you set inside that module.
# You can also write this as follows:
# import database
# ... and use `database.database` to refer to the same thing
# but that involves "stuttering" throughout your code.
# use `database` as you would before - it refers to the "database" object that was found in the "database.py" module
then run it with python main_program.py
The main thing to realise here is that there are two names involved. One's the module, the other is the variable (or function name) you set inside that module that you want to get access to.
Are you asking how to write a module that you can import?
Yes, kinda.
Are these the same set of DB files every time? Are the columns and other configurations the same? Are you writing new python code every month?
They get updated by the accounting team each month. Some of them are CSV, others come from an Access database file, others from the SQL server.
Some of the code needs to be run each month with the updated databases, but there are a lot of ad hoc statistical studies my boss asks for that use the same databases.
Are you using some ETL process to spit out a bunch of files that you'd like to have imported and available easily? Are the formats the same but the filenames differ?
I guess yes. And no, the accountants keep the same filenames but change the directory lmao.
I think it's the first thing you're after. There are a bunch of tutorials knocking around about this, eg,
Thanks, I'm checking it out.
how do I make it available for all my new python projects to use?
I get the impression you're a data person rather than a programmer -perhaps you have a colleague who's more of the latter you can tap up for this?
You're right, I'm an actuary. I wanted to do computer science instead of actuarial science, but I thought it would be better to get an actuarial degree and then do a master's in CS (still in planning, maybe 2026). I'm the only guy at the company who uses Python, and people here think I'm a genius because I have automated some boring things from Excel.
I believe I've actually had this happen with actual VLC, I think I just hit pause and then play and it was fixed. So maybe pause it for half a second after your seek.
Did not work. Thank you for the suggestion, though!
Good news is, for this project, I just learned I can avoid having to do this, so.. at least that's something.
Okay, I actually do need to seek. However, something about going out over bluetooth instead of over the headphone jack has made it work, so.. no idea, but there you go. (Or maybe it's something else that I did, but, either way, it's working, so.. cool.)
Here “database” seems to mean a pandas dataframe. Sounds like you need to create a database using Postgres or sqlite or something similar, and recreate that database from a backup or database dump whenever you need it. You could host that database in the cloud or on your own network as well, if you need access remotely.
I am a little curious about the conditional. I have a suspicion that this is a bit of over engineering.
The problem you seem to be trying to solve is “I need to access the same data in multiple ways, places, or projects.” That’s what a database is really great for. However, if you just need to combine the same csv files you have on disk over and over, why not combine them and dump the output to a csv? Next time you need it, just load the combined csv. FWIW this is loosely what SQLite is doing.
If you are defining a method or function that performs these ETL operations over and over, and the underlying data is not changing, I think updating your local files to be the desired content and format is actually what you want.
If instead you’re trying to explore modules, imports, abstraction, writing DRY code, or other software development fundamentals, great! Play around, it’s a great way to learn. I might also recommend picking up some books! Usually your local library has some books on Python development, object-oriented programming, and data engineering basics that you might find fascinating (as I have).
Some data comes in CSV; other data comes from database files, the SQL server, Excel, or web APIs. For some of it I even need to combine multiple sources with different formats.
I guess I could have a database with everything tidier, easier to use, more secure, and less failure-prone. I'm still going to prepare the databases (I'm thinking of DataFrame objects in a pickle, but I want to experiment with Parquet) so they don't have to be processed every time, but I wanted something where I could just write the name of the database and get the updated version.
There are some reports that need to be run monthly. They have to be edited each month to add the directories with the new databases, and that causes problems, some of which I'm trying to solve with this. There are also a lot of ad hoc statistical studies I need to do that use the same databases.
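One way the "write the name and get the updated version" idea could look, as a minimal sketch rather than a finished design: a single module keeps the directory in one place and maps short names to the file names that never change. DATA_DIR, SOURCES, load, and the sample file are all hypothetical names made up for this example, and it uses the standard csv module so it runs without pandas.

```python
import csv
import tempfile
from pathlib import Path

# The one place to edit when the files move (hypothetical path; a temporary
# directory stands in for it so the sketch runs anywhere).
DATA_DIR = Path(tempfile.mkdtemp())

# Hypothetical dataset names mapped to the file names that stay the same.
SOURCES = {
    "claims": "claims.csv",
    "policies": "policies.csv",
}


def load(name, data_dir=None):
    """Return the rows of the named dataset as a list of dicts."""
    directory = Path(data_dir) if data_dir is not None else DATA_DIR
    with open(directory / SOURCES[name], newline="") as handle:
        return list(csv.DictReader(handle))


# Fake file so the example has something to read.
(DATA_DIR / "claims.csv").write_text("policy_id,amount\n1,100\n2,250\n")

claims = load("claims")
print(claims)
```

The monthly reports would then call load("claims") and never mention a directory themselves. With pandas, the body of load could return pandas.read_csv(...) instead.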
It does sound to me like ingesting all these different formats into a normalized database (aka data warehousing) and then building your tools to report from that centralized warehouse is the way to go. Your warehouse could also track ingestion dates, original format converted from, etc. and then your tools only need to know that one source of truth.
Is there any reason not to build this as a two-step process of 1) ingestion to a central database and 2) reporting from said database?
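A minimal sketch of that two-step shape, assuming SQLite as the central store. The table layout, the ingest helper, and the sample rows are invented for illustration and are not taken from the poster's data.

```python
import sqlite3
from datetime import date

# In-memory database so the sketch leaves nothing behind; a real warehouse
# would use a file path or a database server instead.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claims ("
    "policy_id INTEGER, amount REAL, source_format TEXT, ingested_on TEXT)"
)


def ingest(rows, source_format):
    """Step 1: store rows from any source, tagged with format and date."""
    today = date.today().isoformat()
    conn.executemany(
        "INSERT INTO claims VALUES (?, ?, ?, ?)",
        [(policy_id, amount, source_format, today) for policy_id, amount in rows],
    )
    conn.commit()


# Made-up rows standing in for data parsed from a CSV file and an Access export.
ingest([(1, 100.0), (2, 250.0)], "csv")
ingest([(3, 75.5)], "access")

# Step 2: reports only ever query the central table.
total_by_format = dict(
    conn.execute(
        "SELECT source_format, SUM(amount) FROM claims GROUP BY source_format"
    ).fetchall()
)
print(total_by_format)
```

Only the ingestion step needs to know where each month's files live; the reporting queries stay the same from month to month.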
This sounds kind of like a data warehouse. Depending on the size of the data and the number of connections, I’d say this is a much bigger problem than script vs. database vs. module. Look into dbt (data build tool) and Airflow.
I would say consider having a script that combines all these sources into a single data mart for your monthly reports. Could also be useful for the ad hoc studies, but idk how much of the same fields you're using for these studies.
This says: make mylist a list where each element (called value here) comes from looping over range with the parameters 1 and 20.
If you want to change what each element of the list is, you do it in the first part, on “value”.
So you could do
mylist = [value*5 for value in range(1,20)]  # 5, 10, 15, …, 95
# (not 100, because a range stops before its end value; the end is non-inclusive)
list.append returns None so what you've actually got is a list comprehension that generates a list containing the value None 19 times. (Using functions with side effects, such as list.append, in list comprehensions is generally bad style, so you should avoid this.)
The list[...] syntax retrieves elements from the list, which is not what you're trying to do here. (and it is actually invalid syntax in this case)
You should generally avoid calling lists list, because list is already a builtin.
If you want to append the numbers 1 to 19 to a list as you're trying to do you can call the list.extend function with the list comprehension [value for value in range(1, 20)] as the argument. (Although in this case you can also just use the range directly.) To do it without list comprehensions you can simply loop over the range and repeatedly call the append function.
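To make the points above concrete, here is a small sketch; the variable names are made up for the example. It shows what the append-inside-a-comprehension version actually produces, next to the extend and plain-loop alternatives.

```python
numbers = []

# append() returns None, so the comprehension collects 19 Nones
# while filling `numbers` as a side effect.
result = [numbers.append(value) for value in range(1, 20)]
print(result[:3])    # [None, None, None]
print(len(numbers))  # 19, the appends still happened

# Clearer alternatives that produce the same contents:
with_extend = []
with_extend.extend(range(1, 20))

with_loop = []
for value in range(1, 20):
    with_loop.append(value)

print(with_extend == with_loop == numbers)  # True
```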
List comprehensions return a new list. For the sake of code clarity, you probably shouldn't change a second list from within a list comprehension. If you're trying to concatenate two lists, you can do so in a second line:
a = list(range(10))
b = [ value for value in range(5) ]
a.extend(b)
# a has 15 elements
print(a)
List comprehension is not whatever you're doing there. An example of list comprehension:
list = [value*2 for value in range(1, 20)]
See, a list comprehension is used to make a list from an existing list (or any other iterable). Each value of the new list is defined by an expression. In this case, the values of list will be 2, 4, 6, etc.
Your current syntax list[...], is trying to access an element of a list.
You can. Whatever the method returns will be the element of that list. So if for example I do this:
def mul(x):
    return x*2
list = [mul(value) for value in range(1,20)]
It will have the same effect. But this:
def mul(x):
    return
list = [mul(value) for value in range(1,20)]
will just make every list element None.
Edit to add more:
List comprehension does not come from the range function. Rather, range hands the comprehension a sequence of numbers to walk over (a list in Python 2, a lazy range object in Python 3). You can use any old list, or any other iterable, for it.
Under the hood, it iterates over each element of the iterable you specify (the in ... part) and applies to it the expression you wrote at the very start. If you are familiar with the concept of Array.map in other languages, this is that. There is also a more technical explanation if it helps, but it takes more time to explain. Just let me know if you would like to hear it.
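For the Array.map comparison, a short sketch: Python's built-in map applies a function to each element, and the comprehension gives the same result. mul is the same small helper as in the earlier example; the other names are just for illustration.

```python
def mul(x):
    return x * 2


from_comprehension = [mul(value) for value in range(1, 20)]
from_map = list(map(mul, range(1, 20)))

print(from_comprehension == from_map)  # True
```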
I am somewhat aware of what an array is, as I've dabbled with them in C, and know they can be multi-dimensional. Sorry if I'm being blind, but all I see are function calls in that list comprehension. I think what I'm asking is stupid, as the range function is returning a populated list.
No problem. Learning a new concept is not stupid. So you are familiar with C. In C terms, you would likely do something like this:
int a[10] = {0}; // Just imagine this is 0,1,2,etc...
int b[10] = {0};
for (int i=0; i < 10; i++) {
    b[i] = a[i]*2;
}
A one-to-one equivalent might look like this:
a = range(10) # 0,1,2,etc...
b = []
for x in a:
    b.append(x*2)
However in python, you can then simplify to this:
a = range(10) # Same as before, 0,1,2,etc...
b = [x*2 for x in a]
# This also works
b = [x*2 for x in [0,1,2,...]]
Remember that a list comprehension is used to make a new list, not just to iterate. If you want to do something other than making a list from another list, it is better to use a plain loop. List comprehension is just "syntactic sugar", so to speak. The concept comes from the functional programming paradigm.
Not at all. It is indeed helpful to differentiate between an iterable and literal list. After all, sometimes it will bite you in the ass when you don't differentiate between the two.
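A small sketch of where the difference shows up in Python 3 (the variable names are only for illustration): a range is not equal to the list with the same numbers, and a generator can be walked only once.

```python
numbers = range(3)
print(type(numbers).__name__)      # range
print(numbers == [0, 1, 2])        # False
print(list(numbers) == [0, 1, 2])  # True

# A generator is used up after one pass.
doubled = (x * 2 for x in numbers)
first_pass = list(doubled)
second_pass = list(doubled)
print(first_pass, second_pass)     # [0, 2, 4] []
```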
A list comprehension is used to convert and/or filter elements of another iterable, in your case a range but this could also be another list. So you can think of it as taking one list, filtering/converting each element and producing a new list as a result.
So there's no need to append to any list as that's implicit in the comprehension.
For example, to produce a list of all squares in a range you could do:
[x*x for x in range(10)]
This would automatically "append" each square to the resulting list, there's no need to do that yourself.
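The filtering half works by adding an if clause at the end of the comprehension. As a quick sketch (the name even_squares is just for the example), this keeps only the squares of even numbers:

```python
even_squares = [x * x for x in range(10) if x % 2 == 0]
print(even_squares)  # [0, 4, 16, 36, 64]
```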
The reason that things are much easier with all ASCII data is that practically every Unicode encoding in existence maps bytes 0x00..0x7f to the corresponding code points, so byte strings and Unicode strings that contain the same all-ASCII data are basically equivalent, even semantically. What usually trips people up with non-ASCII data is that the semantic meaning of bytes in the range 0x80..0xff changes from one encoding to another.
But, thinking like a systems programmer again, for many purposes the semantic meaning of bytes 0x80..0xff doesn’t matter. All that matters is that those bytes are preserved unchanged by whatever operations are done. Typical operations like tokenizing strings, looking for markers indicating particular types of data, etc. only need to care about the meaning of bytes in the range 0x00..0x7f; bytes in the range 0x80..0xff are just along for the ride.
So the trick for beating Python 3 strings into submission is to put in encoding and decoding calls where you need to, choosing a single-byte encoding that doesn’t mutate 0x80..0xff. There are many of these; most of the Latin-{1..6} sequence (aka ISO-8859-1..10) has this property. What you do not want to do is pick utf-8 or any of the multibyte Asian encodings. Latin-1 will do fine; in fact it has an advantage over the others in memory consumption, which we’ll describe below.
Whether depending on this is actually correct or not is beyond me, but it seems like people have actually been using that pass-through behavior in practice and put it into things like Python2 -> 3 migration guides.
The first link suggests that the seemingly undefined ranges are valid as C0 and C1 control codes which may be why it doesn't throw errors.
My use case certainly falls into that described by ESR: I only really need to understand markup that falls in the ASCII range and pass the rest through unmodified.
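The pass-through behaviour can be checked directly. This is a small sketch with made-up sample data: every byte value survives a Latin-1 decode/encode round trip, so ASCII markup can be edited as text while the high bytes ride along untouched.

```python
# All 256 byte values survive a Latin-1 round trip unchanged.
raw = bytes(range(256))
round_trip = raw.decode("latin-1").encode("latin-1")
print(round_trip == raw)  # True

# Sample input: ASCII markup around UTF-8 bytes we do not want to interpret.
data = b"<b>caf\xc3\xa9</b>"
text = data.decode("latin-1")
edited = text.replace("<b>", "<strong>").replace("</b>", "</strong>")
result = edited.encode("latin-1")
print(result)  # b'<strong>caf\xc3\xa9</strong>'
```

The intermediate string is not meaningful text for the non-ASCII part; it is only a carrier that keeps those bytes intact until they are encoded back.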
Just in case this comment didn't make it explicitly clear, you can just invoke the python binary inside your venv directly and it will automatically locate all the libraries that are installed in your virtual environment.
To show how this works, you can look at the sys.path variable to see which paths python will search for modules when you run import statements. Try running python3 -c 'import sys; print(sys.path)' using your system python, and you will only see system python library paths. Then, try running it again after replacing python3 with the full path to the python3 binary in your venv, and you will see an additional entry in the output with the lib directory in your venv, which shows that python will also look there for modules when an import statement is executed.
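The same check can be done from inside a script. This is a minimal sketch that assumes an environment created with the standard venv module, where sys.prefix differs from sys.base_prefix:

```python
import sys

# In a venv, sys.prefix points at the environment while sys.base_prefix
# points at the Python installation it was created from.
in_venv = sys.prefix != sys.base_prefix

print("interpreter:", sys.executable)
print("inside a venv:", in_venv)

# The directories Python searches when an import statement runs.
for entry in sys.path:
    print(entry)
```

Running this with the system python3 and then with the python3 binary inside the venv should show the extra entry for the venv's lib directory in the second case.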