How to pick only efficient frontier points in a plot of portfolio performance?
This uses the function cummax
to identify a series of qualifying points by then testing against the original data:
> data <- data[order(data$Stdev),]
> data[ which(data$AvgReturn == cummax(data$AvgReturn)) , ]
Stdev AvgReturn
24 0.81 0.21
19 0.88 0.22
13 0.96 0.24
9 1.05 0.27
5 1.16 0.30
3 1.39 0.31
2 1.53 0.34
1 1.92 0.35
> plot(data)
> points( data[ which(data$AvgReturn == cummax(data$AvgReturn)) , ] , col="green")
It's not actually the convex hull but what might be called the "monotonically increasing hull".
Using R to print the pareto solutions of a dataset
Here's a solution using chull
(convex hull of a set of points). Warning: untested.
# assume d is your matrix or data frame of points
ch <- d[chull(d), ]
ch[apply(ch, 1, function(p) !any(ch[, 1] < p[1] & ch[, 2] < p[2])), ]
You could extend this to >2 dimensions by replacing chull
with your own convex hull algorithm, say one of those from here, and extending the conditional test to fit.
Can I use arbitrary metrics to search KD-Trees?
The nearest-neighbour search procedure described on the Wikipedia page you linked to can certainly be generalised to other distance metrics, provided you replace "hypersphere" with the equivalent geometrical object for the given metric, and test each hyperplane for crossings with this object.
Example: if you are using the Manhattan distance instead (i.e. the sum of the absolute values of all differences in vector components), your hypersphere would become a (multidimensional) diamond. (This is easiest to visualise in 2D -- if your current nearest neighbour is at distance x from the query point p, then any closer neighbour behind a different hyperplane must intersect a diamond shape that has width and height 2x and is centred on p). This might make the hyperplane-crossing test more difficult to code or slower to run, however the general principle still applies.
How to plot multi-objectives pareto frontier with DEAP in Python
Following a recipe in this link (not my own) to calculate the Pareto Points you could do:
def simple_cull(inputPoints, dominates):
paretoPoints = set()
candidateRowNr = 0
dominatedPoints = set()
while True:
candidateRow = inputPoints[candidateRowNr]
inputPoints.remove(candidateRow)
rowNr = 0
nonDominated = True
while len(inputPoints) != 0 and rowNr < len(inputPoints):
row = inputPoints[rowNr]
if dominates(candidateRow, row):
# If it is worse on all features remove the row from the array
inputPoints.remove(row)
dominatedPoints.add(tuple(row))
elif dominates(row, candidateRow):
nonDominated = False
dominatedPoints.add(tuple(candidateRow))
rowNr += 1
else:
rowNr += 1
if nonDominated:
# add the non-dominated point to the Pareto frontier
paretoPoints.add(tuple(candidateRow))
if len(inputPoints) == 0:
break
return paretoPoints, dominatedPoints
def dominates(row, candidateRow):
return sum([row[x] >= candidateRow[x] for x in range(len(row))]) == len(row)
import random
inputPoints = [[random.randint(70,100) for i in range(3)] for j in range(500)]
paretoPoints, dominatedPoints = simple_cull(inputPoints, dominates)
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
dp = np.array(list(dominatedPoints))
pp = np.array(list(paretoPoints))
print(pp.shape,dp.shape)
ax.scatter(dp[:,0],dp[:,1],dp[:,2])
ax.scatter(pp[:,0],pp[:,1],pp[:,2],color='red')
import matplotlib.tri as mtri
triang = mtri.Triangulation(pp[:,0],pp[:,1])
ax.plot_trisurf(triang,pp[:,2],color='red')
plt.show()
, you will notice that the last part is applying a triangulation to the Pareto points and plotting it as a triangular surface. The results is this (where the red shape is the Pareto front):
EDIT: Also you might want to take a look at this (although it seems to be for 2D spaces).
Related Topics
Removing Particular Character in a Column in R
How to Define Fill Colours in Ggplot Histogram
Force Facet_Wrap to Fill Bottom Row (And Leave Any "Gaps" in the Top Row)
Extract Date from Given String in R
How to Get the Min/Max Possible Numeric
Using R to Do a Regression with Multiple Dependent and Multiple Independent Variables
How to Plot a List of Vectors with Different Lengths
Insert Function Variable into Graph Title
How to Deploy Shiny App That Uses Local Data
Convert a Mm-Yy String "Jan-01" into Date Format
Renaming and Hiding an Exported Rcpp Function in an R Package
Control Font Thickness Without Changing Font Size
How to Use With/Within Inside a Function
Additional Metrics in Caret - Ppv, Sensitivity, Specificity
How to Suppress Warnings from Stats:::Regularize.Values