
Computer Vision CSCI-GA.2272-001 Assignment 3



This assignment explores various methods for aligning images and feature
extraction. There are four parts to the assignment:

1. Image alignment using RANSAC: Solve for an affine transformation
between a pair of images using the RANSAC fitting algorithm. [30
points].

2. Estimating Camera Parameters: Using a set of 3D world points and
their 2D image locations, estimate the projection matrix P of a camera.
[35 points].

3. Structure from Motion: Infer the 3D structure of an object, given a
set of images of the object. [20 points].

Please also download the file from the course webpage as
it contains images and code needed for the assignment.


You may perform this assignment in the language of your choice, but Python
is strongly recommended as it is a high-level language with much of the
required functionality built-in.

1 Image Alignment

In this part of the assignment you will write a function that takes two images
as input and computes the affine transformation between them. The overall
scheme, as outlined in lecture 12, is as follows:

• Find local image regions in each image
• Characterize the local appearance of the regions
• Get a set of putative matches between region descriptors in each image
• Perform RANSAC to discover the best transformation between images

The first two stages can be performed using David Lowe’s SIFT feature
detector and descriptor representation. A Python version can be found in the
OpenCV-Python environment (http://opencv-python-tutroals.readthedocs.

html) and also at

The two images you should match are contained in the
file: scene.pgm and book.pgm, henceforth called image 1 and 2 respectively.

You should first run the SIFT detector over both images to produce a set
of regions, each characterized by a 128-dimensional descriptor vector.
Display these regions on each picture to ensure that a satisfactory number
of them have been extracted. Please include the images in your report.

The next step is to obtain a set of putative matches T. This should be
done as follows: for each descriptor in image 1, compute the closest neighbor
amongst the descriptors from image 2 using Euclidean distance. Spurious
matches can then be removed by computing the ratio of the distance to the
closest neighbor over the distance to the second-closest neighbor and
rejecting any match whose ratio is above a certain threshold. To test the
functioning of RANSAC, we want to
have some erroneous matches in our set, thus this threshold should be set to
a fairly slack value of 0.9. To check that your code is functioning correctly,
plot out the two images side-by-side with lines showing the potential matches
(include this in your report).
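
The matching step described above can be sketched with NumPy alone. This
is one possible brute-force version, not a reference solution: the
function name putative_matches is an assumption, the descriptors are
assumed to be given as (n, 128) arrays, and image 2 is assumed to
contribute at least two descriptors so that a second-closest neighbor
exists.

```python
import numpy as np


def putative_matches(desc1, desc2, ratio=0.9):
    """Return (i, j) pairs: descriptor i of image 1 matched to j of image 2.

    A match is kept only if the distance to the closest neighbor is
    below `ratio` times the distance to the second-closest neighbor.
    """
    d1 = np.asarray(desc1, dtype=np.float64)
    d2 = np.asarray(desc2, dtype=np.float64)
    # Pairwise squared Euclidean distances, shape (n1, n2).
    sq = (d1 ** 2).sum(1)[:, None] + (d2 ** 2).sum(1)[None, :] - 2.0 * d1 @ d2.T
    dist = np.sqrt(np.maximum(sq, 0.0))  # clamp tiny negative round-off
    # Indices of the two nearest descriptors in image 2 for every row.
    order = np.argsort(dist, axis=1)[:, :2]
    rows = np.arange(len(d1))
    nearest = dist[rows, order[:, 0]]
    second = dist[rows, order[:, 1]]
    keep = nearest < ratio * second
    return [(int(i), int(order[i, 0])) for i in np.flatnonzero(keep)]
```

Each returned pair indexes into the two keypoint lists, so the match lines
can be drawn by offsetting the image 2 coordinates by the width of image 1
in a side-by-side plot.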

The final stage, running RANSAC, should be performed as follows:

• Repeat N times (where N is ∼100):

• Pick P matches at random from the total set of matches T. Since we
are solving for an affine transformation which has 6 degrees of freedom,
we only need to select P=3 matches.

• Construct a matrix A and vector b using the 3 pairs of points as
described in lecture 12.

• Solve for the unknown transformation parameters q. In Python you
can use linalg.solve (for example, numpy.linalg.solve).

• Using the transformation parameters, transform the image 1 locations
of all matches in T. If the transformation is correct, they should lie
close to their pairs in image 2.

• Count the number of inliers, an inlier being a transformed point from
image 1 that lies within a radius of 10 pixels of its pair in image 2.

• If this count exceeds the best total so far, save the transformation
parameters and the set of inliers.
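
The loop above might be sketched as follows. This is a minimal
illustration under stated assumptions, not the lecture's exact
formulation: the parameter ordering q = [m1, m2, m3, m4, t1, t2] with
x' = m1*x + m2*y + t1 and y' = m3*x + m4*y + t2 is assumed, and the names
build_system, apply_affine and ransac_affine are invented for this
example. pts1 and pts2 are assumed to be (n, 2) arrays holding the matched
locations from T in image 1 and image 2.

```python
import numpy as np


def build_system(src, dst):
    """Stack two rows per match so that A @ q = b."""
    n = len(src)
    A = np.zeros((2 * n, 6))
    A[0::2, 0:2] = src   # rows for x': [x, y, 0, 0, 1, 0]
    A[0::2, 4] = 1.0
    A[1::2, 2:4] = src   # rows for y': [0, 0, x, y, 0, 1]
    A[1::2, 5] = 1.0
    b = np.asarray(dst, dtype=float).reshape(-1)  # [x1', y1', x2', y2', ...]
    return A, b


def apply_affine(q, pts):
    """Map (n, 2) points through the affine transformation q."""
    M = q[:4].reshape(2, 2)
    return pts @ M.T + q[4:]


def ransac_affine(pts1, pts2, n_iters=100, radius=10.0, seed=None):
    """Return (best_q, inlier_mask) over n_iters random 3-match samples."""
    rng = np.random.default_rng(seed)
    pts1 = np.asarray(pts1, dtype=float)
    pts2 = np.asarray(pts2, dtype=float)
    best_q = None
    best_inliers = np.zeros(len(pts1), dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(len(pts1), size=3, replace=False)
        A, b = build_system(pts1[idx], pts2[idx])
        try:
            q = np.linalg.solve(A, b)
        except np.linalg.LinAlgError:
            continue  # degenerate sample, e.g. collinear points
        err = np.linalg.norm(apply_affine(q, pts1) - pts2, axis=1)
        inliers = err < radius
        if inliers.sum() > best_inliers.sum():
            best_q, best_inliers = q, inliers
    return best_q, best_inliers
```

With three matches the 6x6 system is square, which is why a direct solve
is enough here. Using more than three matches per sample would need a
least-squares solver such as numpy.linalg.lstsq instead.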
