Is there a Haskell idiom for updating a nested data structure?
Record update syntax comes standard with the compiler:
addManStk team = team {
manager = (manager team) {
diet = (diet (manager team)) {
steaks = steaks (diet (manager team)) + 1
}
}
}
Terrible! But there's a better way. There are several packages on Hackage that implement functional references and lenses, which is definitely what you want to do. For example, with the fclabels package, you would put underscores in front of all your record names, then write
$(mkLabels ['BBTeam, 'Coach, 'Diet, 'BBPlayer])
addManStk = modify (+1) (steaks . diet . manager)
Edited in 2017 to add: these days there is broad consensus on the lens package being a particularly good implementation technique. While it is a very big package, there is also very good documentation and introductory material available in various places around the web.
What does if __name__ == __main__ : do?
Short Answer
It's boilerplate code that protects users from accidentally invoking the script when they didn't intend to. Here are some common problems when the guard is omitted from a script:
If you import the guardless script in another script (e.g.
import my_script_without_a_name_eq_main_guard
), then the latter script will trigger the former to run at import time and using the second script's command line arguments. This is almost always a mistake.If you have a custom class in the guardless script and save it to a pickle file, then unpickling it in another script will trigger an import of the guardless script, with the same problems outlined in the previous bullet.
Long Answer
To better understand why and how this matters, we need to take a step back to understand how Python initializes scripts and how this interacts with its module import mechanism.
Whenever the Python interpreter reads a source file, it does two things:
it sets a few special variables like
__name__
, and thenit executes all of the code found in the file.
Let's see how this works and how it relates to your question about the __name__
checks we always see in Python scripts.
Code Sample
Let's use a slightly different code sample to explore how imports and scripts work. Suppose the following is in a file called foo.py
.
# Suppose this is foo.py.
print("before import")
import math
print("before function_a")
def function_a():
print("Function A")
print("before function_b")
def function_b():
print("Function B {}".format(math.sqrt(100)))
print("before __name__ guard")
if __name__ == '__main__':
function_a()
function_b()
print("after __name__ guard")
Special Variables
When the Python interpreter reads a source file, it first defines a few special variables. In this case, we care about the __name__
variable.
When Your Module Is the Main Program
If you are running your module (the source file) as the main program, e.g.
python foo.py
the interpreter will assign the hard-coded string "__main__"
to the __name__
variable, i.e.
# It's as if the interpreter inserts this at the top
# of your module when run as the main program.
__name__ = "__main__"
When Your Module Is Imported By Another
On the other hand, suppose some other module is the main program and it imports your module. This means there's a statement like this in the main program, or in some other module the main program imports:
# Suppose this is in some other main program.
import foo
The interpreter will search for your foo.py
file (along with searching for a few other variants), and prior to executing that module, it will assign the name "foo"
from the import statement to the __name__
variable, i.e.
# It's as if the interpreter inserts this at the top
# of your module when it's imported from another module.
__name__ = "foo"
Executing the Module's Code
After the special variables are set up, the interpreter executes all the code in the module, one statement at a time. You may want to open another window on the side with the code sample so you can follow along with this explanation.
Always
It prints the string
"before import"
(without quotes).It loads the
math
module and assigns it to a variable calledmath
. This is equivalent to replacingimport math
with the following (note that__import__
is a low-level function in Python that takes a string and triggers the actual import):
# Find and load a module given its string name, "math",
# then assign it to a local variable called math.
math = __import__("math")
It prints the string
"before function_a"
.It executes the
def
block, creating a function object, then assigning that function object to a variable calledfunction_a
.It prints the string
"before function_b"
.It executes the second
def
block, creating another function object, then assigning it to a variable calledfunction_b
.It prints the string
"before __name__ guard"
.
Only When Your Module Is the Main Program
- If your module is the main program, then it will see that
__name__
was indeed set to"__main__"
and it calls the two functions, printing the strings"Function A"
and"Function B 10.0"
.
Only When Your Module Is Imported by Another
- (instead) If your module is not the main program but was imported by another one, then
__name__
will be"foo"
, not"__main__"
, and it'll skip the body of theif
statement.
Always
- It will print the string
"after __name__ guard"
in both situations.
Summary
In summary, here's what'd be printed in the two cases:
# What gets printed if foo is the main program
before import
before function_a
before function_b
before __name__ guard
Function A
Function B 10.0
after __name__ guard
# What gets printed if foo is imported as a regular module
before import
before function_a
before function_b
before __name__ guard
after __name__ guard
Why Does It Work This Way?
You might naturally wonder why anybody would want this. Well, sometimes you want to write a .py
file that can be both used by other programs and/or modules as a module, and can also be run as the main program itself. Examples:
Your module is a library, but you want to have a script mode where it runs some unit tests or a demo.
Your module is only used as a main program, but it has some unit tests, and the testing framework works by importing
.py
files like your script and running special test functions. You don't want it to try running the script just because it's importing the module.Your module is mostly used as a main program, but it also provides a programmer-friendly API for advanced users.
Beyond those examples, it's elegant that running a script in Python is just setting up a few magic variables and importing the script. "Running" the script is a side effect of importing the script's module.
Food for Thought
Question: Can I have multiple
__name__
checking blocks? Answer: it's strange to do so, but the language won't stop you.Suppose the following is in
foo2.py
. What happens if you saypython foo2.py
on the command-line? Why?
# Suppose this is foo2.py.
import os, sys; sys.path.insert(0, os.path.dirname(__file__)) # needed for some interpreters
def function_a():
print("a1")
from foo2 import function_b
print("a2")
function_b()
print("a3")
def function_b():
print("b")
print("t1")
if __name__ == "__main__":
print("m1")
function_a()
print("m2")
print("t2")
- Now, figure out what will happen if you remove the
__name__
check infoo3.py
:
# Suppose this is foo3.py.
import os, sys; sys.path.insert(0, os.path.dirname(__file__)) # needed for some interpreters
def function_a():
print("a1")
from foo3 import function_b
print("a2")
function_b()
print("a3")
def function_b():
print("b")
print("t1")
print("m1")
function_a()
print("m2")
print("t2")
- What will this do when used as a script? When imported as a module?
# Suppose this is in foo4.py
__name__ = "__main__"
def bar():
print("bar")
print("before __name__ guard")
if __name__ == "__main__":
bar()
print("after __name__ guard")
Common Ruby Idioms
The magic if clause that lets the same file serve as a library or a script:
if __FILE__ == $0
# this library may be run as a standalone script
end
Packing and unpacking arrays:
# put the first two words in a and b and the rest in arr
a,b,*arr = *%w{a dog was following me, but then he decided to chase bob}
# this holds for method definitions to
def catall(first, *rest)
rest.map { |word| first + word }
end
catall( 'franken', 'stein', 'berry', 'sense' ) #=> [ 'frankenstein', 'frankenberry', 'frankensense' ]
The syntatical sugar for hashes as method arguments
this(:is => :the, :same => :as)
this({:is => :the, :same => :as})
Hash initializers:
# this
animals = Hash.new { [] }
animals[:dogs] << :Scooby
animals[:dogs] << :Scrappy
animals[:dogs] << :DynoMutt
animals[:squirrels] << :Rocket
animals[:squirrels] << :Secret
animals #=> {}
# is not the same as this
animals = Hash.new { |_animals, type| _animals[type] = [] }
animals[:dogs] << :Scooby
animals[:dogs] << :Scrappy
animals[:dogs] << :DynoMutt
animals[:squirrels] << :Rocket
animals[:squirrels] << :Secret
animals #=> {:squirrels=>[:Rocket, :Secret], :dogs=>[:Scooby, :Scrappy, :DynoMutt]}
metaclass syntax
x = Array.new
y = Array.new
class << x
# this acts like a class definition, but only applies to x
def custom_method
:pow
end
end
x.custom_method #=> :pow
y.custom_method # raises NoMethodError
class instance variables
class Ticket
@remaining = 3
def self.new
if @remaining > 0
@remaining -= 1
super
else
"IOU"
end
end
end
Ticket.new #=> Ticket
Ticket.new #=> Ticket
Ticket.new #=> Ticket
Ticket.new #=> "IOU"
Blocks, procs, and lambdas. Live and breathe them.
# know how to pack them into an object
block = lambda { |e| puts e }
# unpack them for a method
%w{ and then what? }.each(&block)
# create them as needed
%w{ I saw a ghost! }.each { |w| puts w.upcase }
# and from the method side, how to call them
def ok
yield :ok
end
# or pack them into a block to give to someone else
def ok_dokey_ok(&block)
ok(&block)
block[:dokey] # same as block.call(:dokey)
ok(&block)
end
# know where the parentheses go when a method takes arguments and a block.
%w{ a bunch of words }.inject(0) { |size,w| size + 1 } #=> 4
pusher = lambda { |array, word| array.unshift(word) }
%w{ eat more fish }.inject([], &pusher) #=> ['fish', 'more', 'eat' ]
Multiple IFs with the same return
It all comes down to how the compiler compiles the code. For all practical purposes, they are identical. (As long as you make sure to use short-curcuited OR "||")
Named Parameter idiom in Java
The best Java idiom I've seem for simulating keyword arguments in constructors is the Builder pattern, described in Effective Java 2nd Edition.
The basic idea is to have a Builder class that has setters (but usually not getters) for the different constructor parameters. There's also a build()
method. The Builder class is often a (static) nested class of the class that it's used to build. The outer class's constructor is often private.
The end result looks something like:
public class Foo {
public static class Builder {
public Foo build() {
return new Foo(this);
}
public Builder setSize(int size) {
this.size = size;
return this;
}
public Builder setColor(Color color) {
this.color = color;
return this;
}
public Builder setName(String name) {
this.name = name;
return this;
}
// you can set defaults for these here
private int size;
private Color color;
private String name;
}
public static Builder builder() {
return new Builder();
}
private Foo(Builder builder) {
size = builder.size;
color = builder.color;
name = builder.name;
}
private final int size;
private final Color color;
private final String name;
// The rest of Foo goes here...
}
To create an instance of Foo you then write something like:
Foo foo = Foo.builder()
.setColor(red)
.setName("Fred")
.setSize(42)
.build();
The main caveats are:
- Setting up the pattern is pretty verbose (as you can see). Probably not worth it except for classes you plan on instantiating in many places.
- There's no compile-time checking that all of the parameters have been specified exactly once. You can add runtime checks, or you can use this only for optional parameters and make required parameters normal parameters to either Foo or the Builder's constructor. (People generally don't worry about the case where the same parameter is being set multiple times.)
You may also want to check out this blog post (not by me).
Related Topics
Check for Installed Packages Before Running Install.Packages()
Twitter, Roauth and Windows: Register Ok, But Certificate Verify Failed
Define $ Right Parameter with a Variable in R
How to Make R Beep/Play a Sound at the End of a Script
Problems When Trying to Load a Package in R Due to Rjava
Count Number of Columns by a Condition (>) for Each Row
R - Add Column That Counts Sequentially Within Groups But Repeats for Duplicates
Changing Fonts for Graphs in R
Remove Duplicates Keeping Entry with Largest Absolute Value
How to Initialize Empty Data Frame (Lot of Columns at the Same Time) in R
How to Wait for a Keypress in R
How to Produce Stacked Bars Within Grouped Barchart in R
R: How to Handle Times Without Dates
Automatically Delete Files/Folders
Ggplot2: Change Order of Display of a Factor Variable on an Axis
Remove All of X Axis Labels in Ggplot
Why am I Getting X. in My Column Names When Reading a Data Frame
In R Markdown in Rstudio, How to Prevent the Source Code from Running Off a PDF Page