Python Mutable vs Immutable

Here is a basic introduction to immutable and immutable types in python.

In python there are two types of data… mutable and immutable. Numbers, strings, boolean, tuples, and other simple types are immutable. Dicts, lists, sets, objects, classes, and other complex types are mutable.

When you say:

a = [1,2,3]
b = a

You’ve created a single mutable list in memory, assigned a to point to it, and then assigned b to point to it. It’s the same thing in memory.

Therefore when you mutate it (modify it):

b[0] = 3

It is a modification (mutation) of the index [0] of the value which b points to at that same memory location.

However, when you replace it:

b = [0,0,0]

It is creating a new mutable list in memory and assigning b to point at it.


Check out the id() function. It will tell you the “address” of any variable. You can see which names are pointing to the same memory location with id(varname).


Bonus: Every value in python is passed by reference… meaning that when you assign it to a variable it simply causes that variable to point to that value where it was in memory. Having immutable types allows python to “reuse” the same memory location for common immutable types.

Consider some common values when the interpreter starts up.  You can see here there are a lot of variables pointing at the memory location held by abc.  cpython, at least, is smart enough to realize that the value `abc` is already stored in memory and because it is immutable, just returns that same memory address.

>>> import sys
>>> sys.getrefcount('abc')
68
>>> sys.getrefcount(100)
110
>>> sys.getrefcount(2)
6471

However, a value that is definitely not present would return 2. This has to do with the fact that a couple of references to that value were in-use during the call to sys.getrefcount

>>> sys.getrefcount('nope not me.  I am definitely not here already.')
2

Notice that an empty tuple has a lot of references:

>>> sys.getrefcount(tuple())
34571

But an empty list has no extra references:

>>> sys.getrefcount(list())
1

Why is this? Because tuple is immutable so it is fine to share that value across any number of variables. However, lists are mutable so they MUST NOT be shared across arbitrary variables or changes to one would affect the others.

Incidentally, this is also why you must NEVER use mutable types as default argument values to functions. Consider this innocent little function:

>>> def foo(value=[]):
...     value.append(1)
...     print(value)
...
...

When you call it you might expect to get [1] printed…

>>> foo()
[1]

However, when you call it again, you prob. won’t expect to get [1,1] out… ???

>>> foo()
[1, 1]

And on and on…

>>> foo()
[1, 1, 1]

>>> foo()
[1, 1, 1, 1]

WHY IS THIS? Because default arguments to functions are evaluated once during function definition, and not at function run time. That way if you use a mutable value as a default argument value, then you will be stuck with that one value, mutating in unexpected ways as the function is called multiple times.

The proper way to do it is this:

>>> def foo(value=None):
...     if value is None:
...         value = []
...     value.append(1)
...     print(value)
...
...
>>>
>>> foo()
[1]
>>> foo()
[1]
>>> foo()
[1]

Stain Dipper Parts

Here is a screenshot of some of the parts we’ve designed for the Stain Dipper.  All of the mechanical parts are in the design and correctly positioned and all the hardware has been ordered…

Here are several of the parts from Fusion 360

Fusion360_2018-07-05_02-41-05

Here is a servo motor mount and custom made pulley.

Frame v27

Here is a small piece of Nylon used to connect a 1/16″ cable with a aluminum tube.

Fusion360_2018-07-03_22-27-15

How to install a Trusted Certificate Authority on Windows 7

At my company AppCove, we have our own certificate authority that we use with development servers and sites.  This allows us to (at no additional cost) use HTTPS and SSL for all of these alternate domains and subdomains.

The downside is that our certificate is not trusted by any stock browser or operating system.

Therefore, to prevent getting an ugly and scary SSL warning, anyone who needs to visit these (private audience) sites must first “trust” our certificate authority.

A note on security.  If you are telling your computer to trust a certificate authority, then you must really actually “trust” that authority.  If the signing key fell into the wrong hands, then they could create fake certificates for other sites you visit, like http://www.google.com, and intercept your data.  At AppCove, we use aggressive security measures to protect the certificate authority key (as we do for customer data and applications).

In this example, I am causing my Windows 7 workstation to trust appcove-ca-cert.pem.crt

a

b

c

d

e

f

g

h

i

j

k

l

m

n

— Start of slight detour — 

If you want to verify it was installed, do this.  Otherwise, skip the next 2 screens.

o

p

— End of slight detour —

q

r

At this point, you should be able to visit any HTTPS site that was signed with this certificate authority and your browser will indicate that it is a secure connection.

Introducing FileStruct (for Python)

FileStruct is a lightweight and fast file-cache / file-server designed for web-applications.  It solves the problems of “where do I save all of those uploads” that has been encountered time and time again.  FileStruct uses the local filesystem, but in a sensible way (keeping permissions sane), and with the ability to secure it to a reasonable level.

https://github.com/appcove/FileStruct/

Here is a simple example of taking an image upload, resizing, and saving it:

with client.TempDir() as TempDir:
   open(TempDir.FilePath('upload.jpg'), 'wb').write(mydata)
   TempDir.ResizeImage('upload.jpg', 'resize.jpg', '100x100')
   hash1 = TempDir.Save('upload.jpg')
   hash2 = TempDir.Save('resize.jpg')

Design Goals

Immutable Files

FileStruct is designed to work with files represented by the SHA-1 hash of their contents. This means that all files in FileStruct are immutable.

High Performance

FileStruct is designed as a local repository of file data accessable (read/write) by an application or web application. All operations are local I/O operations and therefore, very fast.

Where possible, streaming hash functions are used to prevent iterating over a file twice.

Direct serving from Nginx

FileStruct is designed so that Nginx can serve files directly from it’s Data directory using an X-Accel-Redirect header. For more information on this Nginx configuration directive, see http://wiki.nginx.org/XSendfile

Assuming that nginx runs under nginx user and file database is owned by the fileserver group, nginx needs to be in thefileserver group to serve files:

# usermod -a -G fileserver nginx

Secure

FileStruct is designed to be as secure as your hosting configuration. Where possible, a dedicated user should be allocated to read/write to FileStruct, and the database directory restricted to this user.

Simple

FileStruct is designed to be incredibly simple to use.

File Manipulaion

FileStruct is designed to simplify common operations on files, especially uploaded files. Image resizing for thumbnails is supported.

Temporary File Management

FileStruct is designed to simplify the use of Temp Files in an application. The API supports creation of a temporary directory, placing files in it, Ingesting files into FileStruct, and deleting the directory when completed (or retaining it in the event of an error)

Garbage Collection

FileStruct is designed to retain files until garbage collection is performed. Garbage collection consists of telling FileStruct what files you are interested in keeping, and having it move the remaining files to the trash.

Backup and Sync with Rsync

FileStruct is designed to work seamlessly with rsync for backups and restores.

Atomic operations

At the point a file is inserted or removed from FileStruct, it is a filesystem move operation. This means that under no circumstances will a file exist in FileStruct that has contents that do not match the name of the file.

No MetaData

FileStruct is not designed to store MetaData. It is designed to store file content. There may be several “files” which refer to the same content. empty.logempty.txt, and empty.ini may all refer to the empty fileData/da/39/da39a3ee5e6b4b0d3255bfef95601890afd80709. However, this file will be retained as long as any aspect of the application still uses it.

Automatic De-Duplication

Because file content is stored in files with the hash of the content, automatic file-level de-duplication occurs. When a file is pushed to FileStruct that already exists, there is no need to write it again.

This carries the distinct benifit of being able to use the same FileStruct database across multiple projects if desired, because the content of file Data/da/39/da39a3ee5e6b4b0d3255bfef95601890afd80709 is always the same, regardless of the application that placed it there.

Note: In the event that multiple instances or applications use the same database, the garbage collection routine MUST take all references to a given hash into account, across all applications that use the database. Otherwise, it would be easy to delete data that should be retained.

How to Generate a SSH Keypair (public/private) on Windows

Have you ever been asked to generate an SSH keypair in order to gain access to a server, github, or an sftp site?

Here is how on windows.

First, download puttygen.exe from here:

http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html

 

Second, run puttygen.exe and follow these instructions:

(except, put your name instead of Sharon)

(On step 8, copy and paste this and send it to whomever requested it)

puttygen instructions

Simulating ENUM in PostgreSQL using CHECK expression

PostgreSQL is a very powerful database. One of the things that seems missing when moving from MySQL is the ability to simply create an enumeration. ENUM is nice when you have a programmatically-semantic set of values for a field.

In PostgreSQL, you have several choices. But one simple one is to create a Check expression, like follows. Skip the IS NULL part if you don’t want the field nullable.

ALTER TABLE 
   "public"."CallCenter_Transfer"
ADD CONSTRAINT 
   "CallCenter_Transfer_TransferStatus"
   CHECK (
      "TransferStatus" IS NULL 
       OR 
      "TransferStatus" = ANY(ARRAY['TransferComplete', 'TransferFailedNoAnswer', 'TransferFailedProspectLost'])
      )