Python

Veraxus , in PuePy: a pure Python frontend web framework

Eww, David.

sus ,

what?

gedhrel , in best way to access databases in different projects

There's not much here to go on. Are you asking how to write a module that you can import?

Are these the same set of DB files every time? Are the columns and other configurations the same? Are you writing new python code every month?

Are you using some ETL process to spit out a bunch of files that you'd like to have imported and available easily? Are the formats the same but the filenames differ?

I think it's the first thing you're after. There are a bunch of tutorials knocking around about this, eg, https://www.digitalocean.com/community/tutorials/how-to-write-modules-in-python-3

You might also be asking: if I write a module, how do I make it available for all my new python projects to use? You could just copy your whatever-my-module-is-called.py file around to your new projects (this might be simplest), but if you're also expecting to be updating it and would like all of your projects to use the updated code, there are alternatives. One is to add the directory containing it to your PYTHONPATH. Another is to install it in editable mode (pip install -e) in your python environment.

[I get the impression you're a data person rather than a programmer - perhaps you have a colleague who's more of the latter you can tap up for this? It doesn't have to be difficult, but there's typically a little bit of ceremony involved in setting up a shared module however you choose to do it.]

gedhrel ,

If it is the first thing, just put the db setup code you're using in one file, call it "database.py"

database.py

# the code you commonly use, ending with
database = ...

From a second file in the same directory, write:
main_program.py

from database import database
# The first "database" here is the module name.
# The second "database" is a variable you set inside that module.
# You can also write this as follows:
# import database
# ... and use `database.database` to refer to the same thing
# but that involves "stuttering" throughout your code.

# use `database` as you would before - it refers to the "database" object that was found in the "database.py" module

then run it with python main_program.py

The main thing to realise here is that there are two names involved. One's the module, the other is the variable (or function name) you set inside that module that you want to get access to.

driving_crooner OP ,

Are you asking how to write a module that you can import?

Yes, kinda.

Are these the same set of DB files every time? Are the columns and other configurations the same? Are you writing new python code every month?

They get updated by the accounting team each month. Some of them are CSV, others come from an Access database file, others from the SQL server.

Some of the code needs to be run each month with the updated databases, but there are a lot of ad hoc statistical studies that my boss asks for that use the same databases.

Are you using some ETL process to spit out a bunch of files that you'd like to have imported and available easily? Are the formats the same but the filenames differ?

I guess yes. And no, the accountants keep the same filenames but change the directory lmao.

I think it's the first thing you're after. There are a bunch of tutorials knocking around about this, eg,

Thanks, I'm checking it out.

how do I make it available for all my new python projects to use?

import sys
sys.path.append('my\\modules\\directory')
import my_module

I get the impression you're a data person rather than a programmer -perhaps you have a colleague who's more of the latter you can tap up for this?

You're right, I'm an actuary. I wanted to do computer science instead of actuarial science, but I thought it would be better to get an actuarial degree and then do a master's in CS (still in planning, maybe 2026). I'm the only guy at the company who uses Python, and people here think I'm a genius because I have automated some boring things from Excel.
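
A minimal sketch of the sys.path approach, using a throwaway directory so it runs anywhere. The module name my_module comes from the comment above; the temporary directory and the DATA_DIR variable inside the module are hypothetical stand-ins for a real shared modules folder.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Stand-in for the shared modules directory (hypothetical location).
modules_dir = Path(tempfile.mkdtemp())
(modules_dir / "my_module.py").write_text("DATA_DIR = 'reports/current'\n")

# Make the directory importable for this interpreter session only.
if str(modules_dir) not in sys.path:
    sys.path.append(str(modules_dir))

# Refresh the import system's caches because the file was just created.
importlib.invalidate_caches()
import my_module

print(my_module.DATA_DIR)
```

The change to sys.path lasts only for the running process, so every script that needs the module has to repeat it; setting PYTHONPATH or doing an editable install avoids that repetition.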

gedhrel ,

If things are changing a bit each month, then in your module rather than a plain variable assignment

database = ...

you might want a function that you can pass in parameters to represent the things that can change:

def database(dir, ...):
    ...
    return ...

Then you can call it like this:

from database import database
db = database("/some/path")

... hope that makes some sense.
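
To make the elided parts concrete, here is one possible shape for such a function. It is only a sketch: the file name ledger.csv, the column names, and the choice of the standard csv module are assumptions for illustration, and the parameter is called directory rather than dir to avoid shadowing the builtin.

```python
import csv
import tempfile
from pathlib import Path


def database(directory, filename="ledger.csv"):
    """Load one CSV from `directory` and return its rows as a list of dicts."""
    path = Path(directory) / filename
    with path.open(newline="") as handle:
        return list(csv.DictReader(handle))


# Demo with a temporary directory standing in for this month's folder.
month_dir = Path(tempfile.mkdtemp())
(month_dir / "ledger.csv").write_text("account,amount\ncash,100\nrent,-40\n")

db = database(month_dir)
print(db[0]["account"], db[0]["amount"])
```

Because the directory is a parameter, next month's run only changes the argument, not the module.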

eager_eagle , in best way to access databases in different projects

But I would like to have a module that I could import and have all my databases and configuration of ETL[...]

ok, then write a module. I'm not sure what's being asked. The best way is what works well for you.

milkisklim , in best way to access databases in different projects

I'm not the biggest expert, but wouldn't this be the whole point of polars's lazy construction?

driving_crooner OP ,

Never heard of that. I just saw a video, and even if it isn't exactly what I need, it looked really cool.

SteveTech , in vlc.py -- setting the time in a song sets the song stuttering and it doesn't really recover

I believe I've actually had this happen with actual VLC, I think I just hit pause and then play and it was fixed. So maybe pause it for half a second after your seek.

TheButtonJustSpins OP ,

Did not work. Thank you for the suggestion, though!
Good news is, for this project, I just learned I can avoid having to do this, so.. at least that's something.

TheButtonJustSpins OP ,

Okay, I actually do need to seek. However, something about going out over bluetooth instead of over the headphone jack has made it work, so.. no idea, but there you go. (Or maybe it's something else that I did, but, either way, it's working, so.. cool.)

originalfrozenbanana , in best way to access databases in different projects

Here “database” seems to mean a pandas dataframe. Sounds like you need to create a database using Postgres or sqlite or something similar, and recreate that database from a backup or database dump whenever you need it. You could host that database in the cloud or on your own network as well, if you need access remotely.

For instance see this pandas doc https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_sql.html

driving_crooner OP ,

Thanks, I could solve it by creating a file with a function:

def get_database(name):
    if name == 'database':
        # all the process to create the database
        return database

And then df = get_database('database') executes all the processes and returns it.
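
One way this pattern could be written without a growing chain of if statements is a dictionary that maps names to loader functions. This is a sketch, not the poster's actual code: the dataset names, the loader functions, and the LOADERS dictionary are all hypothetical placeholders.

```python
def load_ledger():
    # Placeholder for the real ETL steps (reading files, cleaning, merging).
    return [{"account": "cash", "amount": 100}]


def load_claims():
    # Second placeholder dataset, to show how more names are registered.
    return [{"claim_id": 1, "paid": 250}]


# Map each database name to the function that builds it.
LOADERS = {
    "ledger": load_ledger,
    "claims": load_claims,
}


def get_database(name):
    """Build and return the dataset registered under `name`."""
    try:
        loader = LOADERS[name]
    except KeyError:
        raise ValueError(f"unknown database: {name!r}") from None
    return loader()


df = get_database("ledger")
print(df)
```

Adding a new dataset then means writing one function and adding one dictionary entry.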

originalfrozenbanana ,

I am a little curious about the conditional. I have a suspicion that this is a bit of over engineering.

The problem you seem to be trying to solve is “I need to access the same data in multiple ways, places, or projects.” That’s what a database is really great for. However, if you just need to combine the same csv files you have on disk over and over, why not combine them and dump the output to a csv? Next time you need it, just load the combined csv. FWIW this is loosely what SQLite is doing.

If you are defining a method or function that performs these ETL operations over and over, and the underlying data is not changing, I think updating your local files to be the desired content and format is actually what you want.

If instead you’re trying to explore modules, imports, abstraction, writing DRY code, or other software development fundamentals, great! Play around, it’s a great way to learn. I might also recommend picking up some books! Usually your local library has some books on Python development, object-oriented programming, and data engineering basics that you might find fascinating (as I have).

ericjmorey ,

My local library gives me access to O'Reilly Online, so free textbook access for just about any topic.

driving_crooner OP ,

Some data comes in CSV, other data comes from database files, the SQL server, Excel, or web APIs. For some of them I even need to combine multiple sources with different formats.

I guess I could have a database with everything tidier, easier to use, more secure, and with a lower failure rate. I'm still going to prepare the databases (I'm thinking of DataFrame objects in a pickle, but I want to experiment with Parquet) so they don't have to be processed every time, but I wanted something where I could just write the name of the database and get the updated version.
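
The "prepare once, reuse later" idea could look roughly like the following sketch, which uses the standard pickle module on plain Python objects. The cache directory, the get_cached helper, and the build_ledger function are hypothetical names; the same structure would apply to DataFrames or Parquet files.

```python
import pickle
import tempfile
from pathlib import Path

CACHE_DIR = Path(tempfile.mkdtemp())  # hypothetical cache location
build_calls = []  # records how often the slow build actually runs


def build_ledger():
    # Stand-in for the slow multi-source processing.
    build_calls.append("ledger")
    return [{"account": "cash", "amount": 100}]


def get_cached(name, builder, refresh=False):
    """Return the cached object for `name`; rebuild if missing or if refresh is set."""
    cache_file = CACHE_DIR / f"{name}.pkl"
    if cache_file.exists() and not refresh:
        with cache_file.open("rb") as handle:
            return pickle.load(handle)
    data = builder()
    with cache_file.open("wb") as handle:
        pickle.dump(data, handle)
    return data


first = get_cached("ledger", build_ledger)   # builds and writes the cache
second = get_cached("ledger", build_ledger)  # reads the cache, no rebuild
print(first == second, len(build_calls))
```

Passing refresh=True after the accounting team delivers new files forces a rebuild. Pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.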

odium ,

What are you trying to output in the end (dashboard? Report? Table?), how often are these inputs coming in, and how often do you run your process?

driving_crooner OP ,

There are some reports that need to be run monthly. They need to be edited each month to add the directories with the new databases, and that causes problems, some of which I'm trying to solve with this. There are also a lot of ad hoc statistical studies I need to do that use the same databases.

4am ,

It does sound to me like ingesting all these different formats into a normalized database (aka data warehousing) and then building your tools to report from that centralized warehouse is the way to go. Your warehouse could also track ingestion dates, original format converted from, etc. and then your tools only need to know that one source of truth.

Is there any reason not to build this as a two-step process of 1) ingestion to a central database and 2) reporting from said database?
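
As a rough illustration of that two-step split, the sketch below uses the standard sqlite3 module with an in-memory database. The table layout, column names, and the ingest and report_totals functions are assumptions made up for the example, not a recommended schema.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")  # a file path would make this persistent
conn.execute(
    "CREATE TABLE ledger (account TEXT, amount REAL, "
    "source_format TEXT, ingested_on TEXT)"
)


def ingest(rows, source_format):
    """Step 1: normalize rows from any source into the central table."""
    today = date.today().isoformat()
    conn.executemany(
        "INSERT INTO ledger VALUES (?, ?, ?, ?)",
        [(r["account"], float(r["amount"]), source_format, today) for r in rows],
    )
    conn.commit()


def report_totals():
    """Step 2: report from the central table only."""
    query = "SELECT account, SUM(amount) FROM ledger GROUP BY account ORDER BY account"
    return conn.execute(query).fetchall()


ingest([{"account": "cash", "amount": "100"}], source_format="csv")
ingest([{"account": "cash", "amount": 50}, {"account": "rent", "amount": -40}], "access")
print(report_totals())
```

The reporting code never needs to know whether a row started life as a CSV or an Access table; the source_format and ingested_on columns keep that history for auditing.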

originalfrozenbanana ,

This sounds kind of like a data warehouse. Depending on the size of the data and the number of connections, I’d say that whether you use a script, a database, or a module, this is a much bigger problem. Look into dbt (data build tool) and Airflow.

driving_crooner OP ,

I have a data warehouse, and some of the databases I use come from there, but it can only be accessed from the virtual machine.

odium ,

I would say consider having a script that combines all these sources into a single data mart for your monthly reports. Could also be useful for the ad hoc studies, but idk how much of the same fields you're using for these studies.

stevedidwhat_infosec , in Why can't I append to list inside of a list comprehension?

Not quite!

Try:

mylist = [value for value in range(1,20)]

This says I want to make mylist be a list where each element of the list (called value here) comes from doing a for loop on range, given the parameters 1, and 20.

If you want to change what each element of the list is, you do it in the first part, on “value”.

So you could do

mylist = [value*5 for value in range(1,20)]
# 5, 10, 15, ..., 95 (not 100, because a range stops just before its end value; the end is non-inclusive)

Etc. Hope this makes sense!

Edit: MISSING CLOSING PARENTHESIS DOH

decivex , in Why can't I append to list inside of a list comprehension?

  1. list.append returns None, so what you've actually got is a list comprehension that generates a list containing the value None 19 times. (Using functions with side effects, such as list.append, in list comprehensions is generally bad style, so you should avoid this.)
  2. The list[...] syntax retrieves elements from the list, which is not what you're trying to do here. (and it is actually invalid syntax in this case)
  3. You should generally avoid calling lists list, because list is already a builtin.

If you want to append the numbers 1 to 19 to a list as you're trying to do you can call the list.extend function with the list comprehension [value for value in range(1, 20)] as the argument. (Although in this case you can also just use the range directly.) To do it without list comprehensions you can simply loop over the range and repeatedly call the append function.
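
The points above can be checked with a short sketch. The variable names are illustrative only.

```python
numbers = []

# append() mutates the list and returns None, so this collects 19 Nones.
results = [numbers.append(value) for value in range(1, 20)]
print(results[:3])

# Preferred: extend() adds every item from an iterable in one call.
extended = []
extended.extend(range(1, 20))

# Equivalent explicit loop.
looped = []
for value in range(1, 20):
    looped.append(value)

print(extended == looped == numbers)
```

The first comprehension does fill numbers as a side effect, but the list it returns is useless, which is why extend or a plain loop reads better.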

eager_eagle , in Why can't I append to list inside of a list comprehension?

List comprehensions return a new list. For the sake of code clarity, you probably shouldn't change a second list from within a list comprehension. If you're trying to concatenate two lists, you can do so in a second line:

a = list(range(10))
b = [ value for value in range(5) ]
a.extend(b)

# a has 15 elements
print(a)

bitfucker , in Why can't I append to list inside of a list comprehension?

List comprehension is not whatever you're doing there. An example of list comprehension:

list = [value*2 for value in range(1, 20)]

See, list comprehension is used to make a list from an existing list. The value of each element of the new list is defined by a function. In this case, the values of the new list will be 2, 4, 6, etc.

Your current syntax list[...], is trying to access an element of a list.

librejoe OP ,

So you cannot use methods inside a list comprehension, only binary operators and the function range?

onlinepersona ,

Sure you can. As others have said, a list comprehension returns a new list. See the documentation.

What are you trying to do though? Append a list comprehension to an existing list?

See a modified version of @eager_eagle's code from their comment.

def double(x):
  return 2 * x
a = list(range(10))
a.extend(double(value) for value in range(5))

# a has 15 elements
print(a)

Anti Commercial-AI license

bitfucker , (edited )

You can. Whatever the method returns will be the element of that list. So if for example I do this:

def mul(x):
  return x*2

list = [mul(value) for value in range(1,20)]

It will have the same effect. But this:

def mul(x):
  return

list = [mul(value) for value in range(1,20)]

Will just make the list elements all None.

Edit to add more:
List comprehension works not from the range function. Rather, the range function is returning a list. Hence the name, "list comprehension". You can use any old list for it.

What it does under the hood is iterate over each element of the list that you specify (the in ...), and apply the function that you specify at the very start to each one. If you are familiar with the concept of Array.map in other languages, this is that. There is also a technical explanation for it if it helps, but it requires more time to explain. Just let me know if you would like to know it.
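
For comparison with Array.map, Python's builtin map can be placed next to the comprehension. This sketch reuses the mul function from the comment above.

```python
def mul(x):
    return x * 2


# The comprehension and map() apply the same function to every element.
from_comprehension = [mul(value) for value in range(1, 20)]
from_map = list(map(mul, range(1, 20)))

print(from_comprehension == from_map)
print(from_comprehension[:3])
```

map returns a lazy iterator in Python 3, so it is wrapped in list() here to get a list that can be compared.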

librejoe OP ,

Thanks for the response.

I am somewhat aware of what an array is, as I've dabbled with them in C, and know they can be multi-dimensional. Sorry if I'm being blind, but all I see are function calls in that list comprehension. I think what I'm asking is stupid, as the range function is returning a populated list.

bitfucker ,

No problem. Learning a new concept is not stupid. So you are familiar with C. In C terms, you would likely do something like this:

int a[10] = {0}; // Just imagine this is 0,1,2,etc...
int b[10] = {0};
for (int i=0; i < 10; i++) {
  b[i] = a[i]*2;
}

A one-to-one equivalent might look like this:

a = range(10) # 0,1,2,etc...
b = []
for x in a:
  b.append(x*2)

However in python, you can then simplify to this:

a = range(10) # Same as before, 0,1,2,etc...
b = [x*2 for x in a]

# This also works
b = [x*2 for x in [0,1,2,...]]

Remember that list comprehension is used to make a new list, not just for iteration. If you want to do something other than making a list from another list, it is better to use a regular loop. List comprehension is just "syntactic sugar", so to speak. The concept comes from the functional programming paradigm.

librejoe OP ,

Great explanation! I don't know too much C, just a bit here and there, and my dad's copy of K&R C he gave to me.

decivex ,

I know I'm being somewhat pedantic but range() returns an iterable range type, not a list, in python 3.

bitfucker ,

Not at all. It is indeed helpful to differentiate between an iterable and literal list. After all, sometimes it will bite you in the ass when you don't differentiate between the two.
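
A small sketch of where the difference between a list and other iterables shows up in practice. The generator example is an addition for illustration; it is not something discussed earlier in the thread.

```python
r = range(5)
print(isinstance(r, list))   # range is its own type
print(list(r))

# A range can be iterated repeatedly, but a generator is exhausted after one pass.
gen = (x * 2 for x in r)
first_pass = list(gen)
second_pass = list(gen)
print(first_pass, second_pass)

# Lists support in-place mutation; ranges do not.
try:
    r.append(5)
except AttributeError as exc:
    error = str(exc)
print(error)
```

Wrapping an iterable in list() is the usual fix when you need to index into it, mutate it, or loop over it more than once.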

polaris64 , in Why can't I append to list inside of a list comprehension?

A list comprehension is used to convert and/or filter elements of another iterable, in your case a range but this could also be another list. So you can think of it as taking one list, filtering/converting each element and producing a new list as a result.

So there's no need to append to any list as that's implicit in the comprehension.

For example, to produce a list of all squares in a range you could do:

[x*x for x in range(10)]

This would automatically "append" each square to the resulting list, there's no need to do that yourself.
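
Both the converting and the filtering roles can be seen side by side in this short sketch.

```python
# Convert: square every number in the range.
squares = [x * x for x in range(10)]

# Filter and convert: keep only even numbers, then square them.
even_squares = [x * x for x in range(10) if x % 2 == 0]

print(squares)
print(even_squares)
```

The optional if clause at the end decides which elements are kept; the expression at the front decides what each kept element becomes.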

e0qdk , in ISO-8859-x encodings and invalid bytes

I was curious, so I did some searches on this topic for you and found these pages:

The second link in particular notes:

The reason that things are much easier with all ASCII data is that practically every Unicode encoding in existence maps bytes 0x00..0x7f to the corresponding code points, so byte strings and Unicode strings that contain the same all-ASCII data are basically equivalent, even semantically. What usually trips people up with non-ASCII data is that the semantic meaning of bytes in the range 0x80..0xff changes from one encoding to another.

But, thinking like a systems programmer again, for many purposes the semantic meaning of bytes 0x80..0xff doesn’t matter. All that matters is that those bytes are preserved unchanged by whatever operations are done. Typical operations like tokenizing strings, looking for markers indicating particular types of data, etc. only need to care about the meaning of bytes in the range 0x00..0x7f; bytes in the range 0x80..0xff are just along for the ride.

So the trick for beating Python 3 strings into submission is to put in encoding and decoding calls where you need to, choosing a single-byte encoding that doesn’t mutate 0x80..0xff. There are many of these; most of the Latin-{1..6} sequence (aka ISO-8859-1..10) has this property. What you do not want to do is pick utf-8 or any of the multibyte Asian encodings. Latin-1 will do fine; in fact it has an advantage over the others in memory consumption, which we’ll describe below.

Whether depending on this is actually correct or not is beyond me, but it seems like people have actually been using that pass-through behavior in practice and put it into things like Python2 -> 3 migration guides.

The first link suggests that the seemingly undefined ranges are valid as C0 and C1 control codes which may be why it doesn't throw errors.
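
The pass-through behavior described in the quote can be demonstrated with a short sketch. The sample payload with bold tags is invented for illustration.

```python
raw = bytes(range(256))  # every possible byte value, including 0x80..0xff

# Latin-1 maps each byte to the code point with the same number,
# so decoding and re-encoding returns the original bytes.
text = raw.decode("latin-1")
round_tripped = text.encode("latin-1")
print(round_tripped == raw)

# ASCII-range processing leaves the high bytes untouched.
payload = b"<b>caf\xe9</b>"
stripped = payload.decode("latin-1").replace("<b>", "").replace("</b>", "")
print(stripped.encode("latin-1"))

# UTF-8 rejects a lone high byte instead of passing it through.
try:
    b"\xe9".decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
print(utf8_ok)
```

The round trip preserves the bytes, but the decoded text is only meaningful if the data really was Latin-1; for other encodings the high characters are placeholders that should be passed along, not interpreted.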

pantyhosewimp ,

Maybe I’m tripping here but this kinda also explains why the human genome contains lots of noncoding DNA.

Fred OP ,

I think it has more to do with the fact Mother Nature is really inefficient and allocated much more DNA storage than necessary.

Fred OP ,

Thank you for the pointers @e0qdk

My use case certainly falls into that described by ESR: I only really need to understand markup that falls in the ASCII range and pass the rest through unmodified.

namingthingsiseasy , in Recommended way to run my scripts from a venv?

Just in case this comment didn't make it explicitly clear, you can just invoke the python binary inside your venv directly and it will automatically locate all the libraries that are installed in your virtual environment.

To show how this works, you can look at the sys.path variable to see which paths python will search for modules when you run import statements. Try running python3 -c 'import sys; print(sys.path)' using your system python, and you will only see system python library paths. Then, try running it again after replacing python3 with the full path to the python3 binary in your venv, and you will see an additional entry in the output with the lib directory in your venv, which shows that python will also look there for modules when an import statement is executed.
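
The same check can be made from inside a script. This sketch relies on sys.prefix and sys.base_prefix, which differ when the interpreter is running from a virtual environment; the in_venv name is just for illustration.

```python
import sys

# In a virtual environment, sys.prefix points at the venv directory while
# sys.base_prefix still points at the interpreter the venv was created from.
in_venv = sys.prefix != sys.base_prefix

print("interpreter:", sys.executable)
print("running inside a venv:", in_venv)
for entry in sys.path:
    print("search path:", entry)
```

Running this once with the system python3 and once with the full path to the python3 binary in the venv should show the difference in the search path entries.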

gitamar , in Recommended way to run my scripts from a venv?

I use pipenv with pyenv together. This works pretty well, also in cron jobs. Just add pipenv run python script.py to the cron table.

dallen , in Recommended way to run my scripts from a venv?

You could package it and install with pipx
