APPENDIX 8.A SOLVENT CONTENT, ENVELOPE DEFINITION, AND SOLVENT MODELLING


where dprot is the protein density (g cm⁻³). If this is assumed equal to 1.35, then

$V_p = 1.23 / V_M, \qquad V_{\mathrm{solv}} = 1 - V_p.$ (8.A.2)
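As a quick numerical check of (8.A.2), the following minimal Python sketch converts a Matthews coefficient into protein and solvent fractions. The function name `solvent_fraction` and the explicit conversion constant 1.6604 (equal to 10^24/N_A, so that 1.6604/1.35 ≈ 1.23) are assumptions introduced here for illustration.

```python
def solvent_fraction(v_m, d_prot=1.35):
    """Return (V_p, V_solv) for a Matthews coefficient v_m in Å^3/Da.

    d_prot is the assumed protein density in g/cm^3.
    """
    # 1.6604 = 1e24 / N_A converts g/cm^3 into Da/Å^3; with d_prot = 1.35
    # the prefactor 1.6604 / d_prot is the 1.23 appearing in (8.A.2).
    v_p = 1.6604 / (d_prot * v_m)
    return v_p, 1.0 - v_p


if __name__ == "__main__":
    v_p, v_solv = solvent_fraction(2.46)
    print(f"V_p = {v_p:.2f}, V_solv = {v_solv:.2f}")
```

With V_M near 2.46 Å³/Da the cell is split roughly evenly between protein and solvent.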



8.A.2 Envelope definition

Wang (1985) proposed an automatic cyclic procedure for defining a mask,

which should separate the map into solvent and molecule space, in accordance with the ratio fixed by the Matthews criterion. The Wang procedure may

be described as follows:

(a) The current electron density map (no matter if it has been obtained by

ab initio or non-ab initio techniques) is truncated according to:

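$\rho_{\mathrm{trunc}}(\mathbf{r}) = \rho(\mathbf{r})$ if $\rho(\mathbf{r}) > \rho_{\mathrm{solv}}$,

$\rho_{\mathrm{trunc}}(\mathbf{r}) = 0$ if $\rho(\mathbf{r}) \le \rho_{\mathrm{solv}}$,

where the threshold, ρsolv, is chosen to meet the expected solvent content.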

(b) ρtrunc is smoothed (into ρsm(r)) by associating, at each point r of ρtrunc, the weighted average density over the points included in an encompassing sphere of radius R (between 8 and 4 Å, according to the resolution or, also, to the quality of the structure):

$\rho_{\mathrm{sm}}(\mathbf{r}) = \sum_{\mathbf{r}'} w(\mathbf{r} - \mathbf{r}')\, \rho_{\mathrm{trunc}}(\mathbf{r}'),$ (8.A.3)

where

$w(\mathbf{r} - \mathbf{r}') = 1 - d/R$ for $d < R$,

$w(\mathbf{r} - \mathbf{r}') = 0$ for $d > R$,

and $d = |\mathbf{r} - \mathbf{r}'|$ is the distance between points r and r′.

(c) A cut-off value, ρcut, is calculated, which divides the unit cell into two regions, solvent and protein; solvent pixels are marked by the condition ρsm(r) ≤ ρcut, and voids internal to the molecular envelope are filled in.

(d) A solvent corrected map is obtained by setting all the values outside the

protein envelope to a low constant value; the electron density values inside

the molecular envelope are set to the current values (say, to the values

defined at point a).

(e) New phases are obtained by Fourier inversion of the solvent corrected map. Often such phases are combined with the experimental phases

(those obtained via SAD-MAD, SIR-MIR, or MR techniques). The corresponding electron density map is the new basis for the application of

point a.
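Steps (a) to (c) of the Wang procedure can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions: the map is a periodic grid, the radius is given in grid units, both thresholds are taken as quantiles matching the expected solvent fraction, and the filling of internal voids is omitted. The names `smooth_wang` and `wang_mask` are hypothetical.

```python
import itertools

import numpy as np


def smooth_wang(rho_trunc, radius_vox):
    """Direct-space sum of (8.A.3) on a periodic grid (radius in grid units)."""
    r = int(np.floor(radius_vox))
    axes = tuple(range(rho_trunc.ndim))
    rho_sm = np.zeros(rho_trunc.shape, dtype=float)
    for shift in itertools.product(range(-r, r + 1), repeat=rho_trunc.ndim):
        d = np.sqrt(sum(s * s for s in shift))
        if d < radius_vox:
            w = 1.0 - d / radius_vox          # conical weight w = 1 - d/R
            # np.roll gives out[r] = in[r - shift], i.e. rho_trunc(r')
            rho_sm += w * np.roll(rho_trunc, shift, axis=axes)
    return rho_sm


def wang_mask(rho, solvent_fraction, radius_vox):
    """Return a boolean mask (True = protein) following steps (a)-(c)."""
    # (a) truncate below a threshold chosen from the expected solvent content
    rho_solv = np.quantile(rho, solvent_fraction)
    rho_trunc = np.where(rho > rho_solv, rho, 0.0)
    # (b) smooth with the conical weight
    rho_sm = smooth_wang(rho_trunc, radius_vox)
    # (c) cut-off: points with rho_sm <= rho_cut are marked as solvent
    rho_cut = np.quantile(rho_sm, solvent_fraction)
    return rho_sm > rho_cut
```

Step (d) then amounts to something like `np.where(mask, rho, rho.min())`, with the low constant value chosen by the user. The explicit loop over offsets is slow for realistic grids and radii.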

Two years later, Leslie (1987) observed that (8.A.3) is a convolution, and that

flattening may be more easily performed via its Fourier transform:

T[ρsm (r)] = Fsm (h) = T[w]T[ρtrunc ] = g(s) · Ftrunc ,

where Ftrunc is readily calculated by Fourier inversion of the truncated map.

g(s) is the Fourier transform of the weight function, sum of two components,

the first of which is the Fourier transform of a sphere. According to James

(1962),

g(s) = Y(uR) − Z(uR),




where s = 2 sin θ/λ, u = 2πs,

$Y(x) = 3(\sin x - x\cos x)/x^{3},$

and

$Z(x) = 3[2x\sin x - (x^{2} - 2)\cos x - 2]/x^{4}.$

Leslie’s procedure improved the efficiency of the flattening technique and

dramatically reduced the computing time.
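The reciprocal-space weight g(s) = Y(uR) − Z(uR) is straightforward to evaluate. The sketch below is an illustration, not Leslie's code: the function names are hypothetical, and the small-x series (Y ≈ 1 − x²/10, Z ≈ 3/4 − x²/12) is an addition made here to avoid the 0/0 form at the origin. In the limit s → 0 the weight tends to 1/4, which is the integral of 1 − d/R over the sphere divided by the sphere volume.

```python
import numpy as np


def sphere_ft(x):
    """Y(x) = 3 (sin x - x cos x) / x^3, with a series branch near x = 0."""
    x = np.asarray(x, dtype=float)
    small = np.abs(x) < 1e-2
    xs = np.where(small, 1.0, x)              # dummy value in the unused branch
    full = 3.0 * (np.sin(xs) - xs * np.cos(xs)) / xs**3
    return np.where(small, 1.0 - x**2 / 10.0, full)


def cone_ft(x):
    """Z(x) = 3 [2x sin x - (x^2 - 2) cos x - 2] / x^4, series near x = 0."""
    x = np.asarray(x, dtype=float)
    small = np.abs(x) < 1e-2
    xs = np.where(small, 1.0, x)
    full = 3.0 * (2.0 * xs * np.sin(xs) - (xs**2 - 2.0) * np.cos(xs) - 2.0) / xs**4
    return np.where(small, 0.75 - x**2 / 12.0, full)


def g_weight(s, radius):
    """g(s) = Y(uR) - Z(uR) with u = 2*pi*s and s = 2 sin(theta)/lambda."""
    x = 2.0 * np.pi * np.asarray(s, dtype=float) * radius
    return sphere_ft(x) - cone_ft(x)
```

Multiplying the Fourier coefficients of the truncated map by `g_weight(s, R)` and transforming back gives the smoothed map, up to the normalisation convention of the FFT in use.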

It may be useful to mention that several attempts have been made to estimate the protein envelope at very low resolution (say, about 8 Å or worse).

The necessary prior information consists of unit cell parameters, space group,

high quality diffraction data, complete up to a fixed resolution, and a rough

estimate of the solvent fraction. Attempts began with Kraut (1958). Somewhat

later, different algorithms were proposed, summarized as follows: the histogram method (Luzzati et al., 1988; Mariani et al., 1988; Lunin et al., 1990),

the condensing protocol (Subbiah, 1991, 1993; David and Subbiah, 1994), the

one sphere method (Harris, 1995), FAM (Lunin et al., 1995; Moras et al., 1983;

Urzhumtsev et al., 1996). The general idea at the basis of all these algorithms

was to define at very low resolution a rough envelope, which may be easier

than at high resolution. Once a model envelope is obtained, phase extension

at higher resolution should be performed, mainly via solvent flattening, histogram matching, etc. to progressively improve identification of the solvent

region, and then allow solution of the protein structure.

The above methods were able to find good (even if rough) envelope models,

but their weak point was the phase extension, quite difficult from very low

resolution. In recent years these methods have been shelved; however, their appeal may increase again in the future.



8.A.3 Models for the bulk solvent

The narrow boundary region (within a 7 Å boundary layer) between the protein and the solvent exhibits an ordered structure of strongly bonded water

molecules. As a rule of thumb, about one water molecule per residue belongs

to such an ordered substructure (Kleywegt and Jones, 1997a). The solvent is

disordered beyond this shell, and solvent flattening techniques use this characteristic property to improve the protein phases. Since the bulk solvent may

significantly contribute to the structure factors, taking into account its contribution may improve agreement between calculated and observed structure

factors; this may be useful both in the refinement step, and in the phasing step

itself (e.g. in the translation step of molecular replacement).

The effect of the solvent on structure factors may be understood as follows:

cancelling the solvent contribution from the calculated structure factor is equivalent to setting the electron density of the bulk solvent to zero. This implies

an infinitely sharp contrast between protein surface and solvent, with an overestimate of the low resolution structure factor amplitudes. We will quote two

models for the solvent:




1. Exponential bulk solvent. This is based on the Babinet principle, according to which, if the unit cell is divided in two parts, one relative to the

disordered solvent and the other to the protein molecule, then Fsolv = −FP ,

where the first structure factor refers to the solvent volume and the second to

the protein volume. To understand the above relationship, we notice that, by

a property of the Fourier transform, a unit cell with constant electron density will show vanishing structure factor amplitudes, except for F000 . The

contribution of the solvent bulk is therefore opposite to that of the protein

volume, and will tend to weaken the amplitudes of the latter. An approximation to solvent scattering may be achieved by placing atoms with very high

temperature factor (e.g. 200 Å²) in the solvent region. The effects of the above model may be represented by calculating the total structure factor, Ft, as (Glykos and Kokkinidis, 2000b)

$F_t(\mathbf{h}) = F_P(\mathbf{h})\,[1 - k_{\mathrm{sol}} \exp(-B_{\mathrm{sol}} s^{2}/4)],$

where s = 2 sin θ/λ and ksol is the ratio between the mean electron densities of the solvent and of the protein. Since:

(i) the electron density of water is about 0.334 e⁻/Å³, and that for a salt solution may be estimated at around 0.40 e⁻/Å³;

(ii) the protein density may be estimated as close to 0.439 e⁻/Å³,

then ksol may be approximated to 0.76 (0.334/0.439 ≈ 0.76). If we choose Bsol ≈ 200 Å², the effects of the solvent will disappear rapidly at higher resolution.

2. Flat bulk solvent. A flat mask is used as the solvent model; it is located in the solvent region, at a distance of about 1.4 Å from the van der Waals surface of the protein. The bulk solvent region is then uniformly filled by a continuous electron density, which contributes to the total structure factor, in accordance with (Jiang and Brunger, 1994),

$F_t(\mathbf{h}) = F_P(\mathbf{h}) + k_{\mathrm{sol}}\, F_{\mathrm{sol}}(\mathbf{h}) \exp(-B_{\mathrm{sol}} s^{2}/4).$ (8.A.4)

Minimizing the residual between the observed and the solvent corrected structure factors, Ft, provides optimal values for the parameters ksol and Bsol (typical values are ksol ≈ 0.4 and Bsol ≈ 45 Å²). This kind of bulk solvent correction is implemented in several refinement programs and is also used to improve the efficiency of the translation step in MR programs.
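The two bulk solvent models differ only in how the solvent term is built, as the following sketch suggests. The function names and default parameter values (taken from the figures quoted above) are illustrative assumptions; `f_sol` is assumed to be the structure factor of the flat solvent mask, computed elsewhere.

```python
import numpy as np


def babinet_scale(s, k_sol=0.76, b_sol=200.0):
    """Exponential model: factor multiplying F_P, with s = 2 sin(theta)/lambda."""
    s = np.asarray(s, dtype=float)
    return 1.0 - k_sol * np.exp(-b_sol * s**2 / 4.0)


def flat_solvent_total(f_p, f_sol, s, k_sol=0.4, b_sol=45.0):
    """Flat model (8.A.4): F_t = F_P + k_sol * F_sol * exp(-B_sol s^2 / 4)."""
    s = np.asarray(s, dtype=float)
    return f_p + k_sol * f_sol * np.exp(-b_sol * s**2 / 4.0)


if __name__ == "__main__":
    for d in (20.0, 8.0, 4.0, 2.0):           # resolution in Å, s = 1/d
        print(d, float(babinet_scale(1.0 / d)))
```

The printed factors rise from about 0.33 at 20 Å towards 1 at 2 Å, which illustrates why the correction matters mainly for the low resolution reflections.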



APPENDIX 8.B HISTOGRAM MATCHING

This technique is widely used in image processing; it aims to improve the

image quality by fitting the density distribution of an image with the ideal

distribution. From this point of view, the electron density is an image of the

crystal structure, the quality of which should be improved by fitting the density

frequency with standard distributions.

The actual form of a histogram depends on several parameters, among which

are:

1. the fraction of the unit cell volume occupied by the solvent;

2. the resolution at which the diagram is calculated;




3. the mean phase error associated with the structure factors;

4. the overall temperature factor.

To circumvent the effects of the temperature parameter, histogram matching

procedures remove the overall temperature factor from all the |F|s. This allows

simplification of the method, since it is not necessary to use different standard

histograms for different temperature factors. Accordingly, the standard histogram, which is relative to the frequency distribution of the density in the

protein region, may be treated as a function of the resolution only. It may be

obtained from the electron density map of a similar known structure or from a

formula. Main (1990a,b) (see also Lunin and Skovoroda, 1991) has developed

a six-parameter formula which produces useful histograms over a range of

resolutions from 4.5 to 0.9 Å. Histograms are calculated by considering the

densities only within the molecular envelope. We note that:



Fig. 8.B.1 Electron density histograms, P(ρ) versus ρ, obtained from refined phases (solid line) and from approximated phases (dashed line).



1. The flatness of the histogram increases with the average phase error. In

Fig. 8.B.1 we overlap the histogram corresponding to the refined structure

with one obtained from approximate phase values.

2. Histograms are asymmetric; the asymmetry is a consequence of the positivity of the electron density (negative density values are less frequent

than positive ones) and may be used as a criterion for phase correction

(Podjarny and Yonath, 1977). On the other hand, the negative regions must

be present in the histograms because they are generated by unavoidable

series termination errors. Skewness, say,

$\gamma = \left\langle \left( \frac{\rho - \langle \rho \rangle}{\sigma_d} \right)^{3} \right\rangle$ with $\sigma_d = \langle (\rho - \langle \rho \rangle)^{2} \rangle^{1/2}$,

is usually calculated to evaluate the asymmetry; it can be positive, negative, or undefined. Negative skewness values indicate that the tail on the left-hand side of the probability density function is longer than on the right-hand side; a positive skewness indicates that the tail on the right-hand side is longer than on the left-hand side; a zero value indicates that the values are relatively evenly distributed on both sides of the mean. In our case, skewness is expected to be positive.

Fig. 8.B.2 Electron density profile, P(ρ) versus ρ: variation with resolution.

3. The histogram changes with the data resolution (see Fig. 8.B.2). The histogram for high resolution maps has its maximum close to ρ = 0; for low

resolution maps, the maximum shifts to higher values of ρ, and the peak

is broader. The peak of the histogram lowers to a minimum at about 3 Å

resolution; as the resolution decreases, the peak rises again, moves towards

higher density, and becomes broader. Long tails towards high density are

present in high resolution maps.

4. The histogram matching technique may be applied as follows (Zhang and

Main, 1990a,b):

(i) From a given set of B-parameter corrected structure factors, the Fourier

synthesis and the corresponding histogram are calculated. The latter is

compared with the standard histogram.

(ii) The electron density histogram of the actual map is divided into smaller areas with boundaries ρi, i = 1, . . . , n (n ∼ 100) (see Fig. 8.B.3a). The standard histogram is also divided into smaller areas, with boundaries ρ′i, i = 1, . . . , n (see Fig. 8.B.3b).

(iii) Scale factors ai and shifts bi are calculated to map ρ into ρ′ for the ith interval:

$\rho' = a_i \rho + b_i$ (8.B.1)

where

$a_i = \frac{\rho'_{i+1} - \rho'_i}{\rho_{i+1} - \rho_i}, \qquad b_i = \frac{\rho'_i \rho_{i+1} - \rho'_{i+1} \rho_i}{\rho_{i+1} - \rho_i}.$

For example, if only a scale factor k relates the two maps (e.g. ρ′ = kρ), then ai = k and bi = 0 for any i. If only a shift relates the two maps (e.g. ρ′ = ρ + b), then ai = 1 and bi = b for any i.

(iv) The operation (8.B.1) is applied to the actual map for each interval; the

new map will show the same density distribution as that expected.

(v) A new set of structure factors is calculated from the modified electron density, whose phases are employed for the next cycle.
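Steps (ii) to (iv) amount to a piecewise-linear map that sends each interval of the current histogram onto the matching interval of the standard one. The sketch below assumes that the two sets of boundaries are already available as increasing arrays of equal length; the function name `piecewise_match` and the clipping of values outside the outermost boundaries are choices made here for illustration.

```python
import numpy as np


def piecewise_match(rho, bounds, bounds_std):
    """Apply rho' = a_i * rho + b_i interval by interval.

    bounds     : increasing boundaries rho_i of the current histogram
    bounds_std : corresponding boundaries rho'_i of the standard histogram
    """
    rho = np.asarray(rho, dtype=float)
    bounds = np.asarray(bounds, dtype=float)
    bounds_std = np.asarray(bounds_std, dtype=float)

    width = np.diff(bounds)                               # rho_{i+1} - rho_i
    a = np.diff(bounds_std) / width
    b = (bounds_std[:-1] * bounds[1:] - bounds_std[1:] * bounds[:-1]) / width

    # interval index of every density value; values outside the outermost
    # boundaries reuse the first or last interval
    idx = np.clip(np.searchsorted(bounds, rho, side="right") - 1, 0, a.size - 1)
    return a[idx] * rho + b[idx]
```

A pure scaling of the boundaries reproduces ai = k, bi = 0, and a pure shift reproduces ai = 1 with a constant bi, in line with the two special cases of step (iii).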

A more intuitive approach is to let P(ρ) and Ps(ρ) be the current and the standard reference density histograms, respectively (both sum to unity), and N(ρ) and Ns(ρ) the corresponding cumulative distributions. The transformation of P(ρ) into Ps(ρ) is made as follows. For any density value ρ, the corresponding point in N(ρ) is calculated; this is mapped onto Ns, and the desired modified value ρ′ in the standard distribution is obtained by inverting the cumulative standard distribution:

$\rho' = N_s^{-1}[N(\rho)].$
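The cumulative-distribution formulation translates almost directly into code. In this hedged sketch the standard histogram is represented by a sample of reference density values (`rho_ref`, for instance taken from the map of a similar known structure) rather than by an analytical formula; the function name and the rank-based estimate of the current cumulative distribution are assumptions.

```python
import numpy as np


def match_to_reference(rho, rho_ref):
    """Map every value of rho onto the reference value with the same
    cumulative frequency: the inverse reference CDF applied to N(rho)."""
    flat = np.asarray(rho, dtype=float).ravel()
    # N(rho): empirical cumulative frequency of each grid value, in (0, 1)
    ranks = np.argsort(np.argsort(flat))
    n_of_rho = (ranks + 0.5) / flat.size
    # inverse cumulative distribution of the reference sample
    matched = np.quantile(np.asarray(rho_ref, dtype=float).ravel(), n_of_rho)
    return matched.reshape(np.shape(rho))
```

The transformation preserves the ordering of the density values, so peaks stay peaks; only their heights are redistributed to follow the reference histogram.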

Histogram matching is usefully combined with solvent flattening techniques

as follows:

(a) The molecular envelope is obtained.

(b) The solvent region is flattened, while the density within the molecular

envelope is matched with the expected histogram. Obviously, histogram

matching efficiency is high when the solvent region is a small percentage of the unit cell. When the reverse condition occurs, solvent flattening

effects are dominant.

(c) Structure factors are calculated from the above modified map and their phases are (possibly) combined with experimental phases. If a phase

extension process is started, the extended phases are accepted at the

calculated values.

(d) A new map is calculated using data obtained at step (c), and the procedure

is repeated from step (a) until convergence is obtained.

It is obvious that histogram matching and solvent flattening procedures are not able to suppress all false peaks from an experimental electron density map and/or generate all the supplementary peaks to complete the structure. Indeed, false density inside the envelope tends to remain, while molecular density outside the envelope may remain strongly depressed. However, these procedures are able to produce remarkable improvements in the maps.

Fig. 8.B.3 (a) Electron density histogram for the actual electron density model, with interval boundaries ρi and ρi+1; (b) standard electron density histogram, with interval boundaries ρ′i and ρ′i+1.



APPENDIX 8.C A BRIEF OUTLINE OF THE ARP/wARP PROCEDURE

The automatic refinement procedure (ARP) is based on a free atom approach.

A set of dummy atoms is created, new atoms are added and old ones are

deleted to create new models which, cyclically re-evaluated, should end in a

final model describing electron density and target protein. The procedure may

be described schematically as follows.

Dummy atoms (of the same atomic species, say O) are located in the high

density regions of the best available electron density map; a fine grid of about

0.25 Å is used. The initial model is gradually expanded; the density threshold is

gradually lowered and additional atoms are located at bonding distances from

existing atoms. The model is completed when the number of dummy atoms is about 3 × NAT, where NAT is the expected number of atoms. At the end of the updating process (see below) the number of atoms is reduced to about 1.2 × NAT.

The model is updated as follows.

(i) Atom rejection. Hybrid electron densities of type 3Fo − 2Fc are calculated;

an atom is removed on the basis of the density at the atomic centre,

on shape criteria (e.g. sphericity), and distance criteria (e.g. too close to

accepted atoms).

(ii) Atom addition. The Fo − Fc synthesis is calculated; the grid point with the highest density value is selected as a new atomic position, provided that this satisfies defined distance constraints in relation to other positioned atoms. Grid points at small distance from this added atom are rejected and the next highest grid point is selected.

(iii) Model refinement. This may be performed by a cyclic procedure based

on unrestrained least squares or maximum likelihood refinement (in both

the cases the procedure aims at matching calculated to observed structure

factors), and/or by real-space refinement (an atom may be moved from the

peak position on the basis of a density shape analysis around it). Usually,

the reciprocal space refinement is performed by REFMAC (Murshudov et al., 1996); it needs a number of observations greater than the number of model parameters, which sets the resolution limit of ARP/wARP to about 2.5 Å.
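The atom addition step (ii) is essentially a greedy peak search with an exclusion distance. The sketch below illustrates that idea only: it is not ARP/wARP code, it uses a single minimum-distance criterion `d_min` as a stand-in for the "defined distance constraints", and it ignores crystal symmetry and periodicity. All names are hypothetical.

```python
import numpy as np


def add_atoms(density, grid_xyz, atoms_xyz, n_new, d_min):
    """Pick up to n_new grid points of highest difference density that lie
    at least d_min (Å) from every atom already in the model.

    density   : (N,) difference-density values at the grid points
    grid_xyz  : (N, 3) Cartesian coordinates of the grid points
    atoms_xyz : iterable of existing atomic positions
    """
    atoms = [np.asarray(a, dtype=float) for a in atoms_xyz]
    picked = []
    for idx in np.argsort(density)[::-1]:      # highest density first
        if len(picked) == n_new:
            break
        p = np.asarray(grid_xyz[idx], dtype=float)
        # reject grid points too close to accepted atoms, including the
        # ones added earlier in this same loop
        if all(np.linalg.norm(p - a) >= d_min for a in atoms):
            atoms.append(p)
            picked.append(int(idx))
    return picked
```

Atom rejection (i) would be a complementary filter on the 3Fo − 2Fc density at the atomic centres, which is not shown here.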

Up to this point, no attempt has been made to give the atoms chemical meaning in terms of atomic species, bond distances, bond angles, protein secondary structures, etc.; typically, free atoms lie within 0.5–0.6 Å of the corresponding positions in the correct structure.

Following this, model reconstruction starts; its task is to discard atoms in

false positions, assign atomic species to the well-located atoms, and to establish their connectivity. Only when atomic species, bonds, and angles for a




group of atoms have been defined (see below), will stereochemical restraints

be applied in restrained refinements; this will improve the ratio of observations

to parameters, and will increase the efficiency of the least squares refinement

(which will then become hybrid, because free and restrained atomic positions

will coexist).

Model reconstruction starts with identification of the main-chain atoms.

Every Cα atom should lie 3.8 Å from at least one other candidate Cα atom, which may be connected to the first one by a forward (outgoing) directionality (–C(=O)–N–Cα) or by an incoming (backward) directionality (N–C(=O)–Cα). If two atoms i and j are Cα candidates (see Fig. 8.C.1), then a peptide unit plane is placed between the candidates and rotated about the i–j axis.

If, for a given rotation angle, the interpolated electron density at the peptide

atomic positions is larger than a given threshold, atoms i and j are flagged as Cα

atoms.

The electron density maps to which automated model building (AMB) algorithms are applied usually show non-negligible phase errors; therefore, the condition that two consecutive Cα atoms should lie 3.8 Å apart must be replaced by a more permissive condition, say, that the distance should lie in a range (e.g. 3.8 ± 1 Å).

The result is that many candidates may be connected by more than one

incoming and one outgoing connection, with the consequent combinatorial

explosion of the possible chains. ARP/wARP solves the problem by dividing each candidate chain into small structural subunits and by evaluating,

by stereochemical arguments, the probability of each subunit being the correct one. Subunits consisting of four consecutive Cα atoms are used, say

Cα (n) − Cα (n + 1) − Cα (n + 2) − Cα (n + 3), and the two-dimensional frequency distributions of the angle Cα (n) − Cα (n + 1) − Cα (n + 2) and of the

dihedral angle Cα (n) − Cα (n + 1) − Cα (n + 2) − Cα (n + 3) are tested against

the distribution derived from database analysis (Oldfield and Hubbard, 1994;

Kleywegt, 1997). This information is of a three-dimensional nature, and may

be used to obtain a score for the subunit (of length four) parameters. The main

chain is built by overlapping the last three atoms of one subunit with the first

three of the following. The chain scores are then obtained by summation of the

subunit scores.
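The geometric quantities used in this scoring are simple to compute. The following sketch lists candidate Cα pairs within the permissive distance window and evaluates the angle and dihedral angle of a four-atom subunit; the scoring against database distributions is not reproduced. Function names, the default window of 3.8 ± 1 Å, and the sign convention of the dihedral are assumptions.

```python
import numpy as np


def ca_candidate_pairs(xyz, d0=3.8, tol=1.0):
    """Index pairs (i, j), i < j, whose distance lies within d0 ± tol (Å)."""
    xyz = np.asarray(xyz, dtype=float)
    d = np.linalg.norm(xyz[:, None, :] - xyz[None, :, :], axis=-1)
    i, j = np.nonzero(np.triu(np.abs(d - d0) <= tol, k=1))
    return list(zip(i.tolist(), j.tolist()))


def ca_angle(p0, p1, p2):
    """Angle Ca(n)-Ca(n+1)-Ca(n+2) in degrees."""
    u, v = p0 - p1, p2 - p1
    c = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(c, -1.0, 1.0))))


def ca_dihedral(p0, p1, p2, p3):
    """Signed dihedral Ca(n)-Ca(n+1)-Ca(n+2)-Ca(n+3) in degrees."""
    b0, b1, b2 = p0 - p1, p2 - p1, p3 - p2
    b1 = b1 / np.linalg.norm(b1)
    v = b0 - np.dot(b0, b1) * b1               # b0 projected off the axis
    w = b2 - np.dot(b2, b1) * b1               # b2 projected off the axis
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.degrees(np.arctan2(y, x)))
```

Each pair returned by `ca_candidate_pairs` would still have to pass the peptide-plane density test described above before being accepted as a connection.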

Limited data resolution and quality of the phases, combined with the natural conformational flexibility of the chain, may not allow recovery of a full

continuous chain; several main-chain fragments may be obtained, separated

by gaps, and some chain fragments may be wrongly identified. The lower the

quality of the starting electron density map and data resolution, the larger the

probability of having a large number of gaps.

Once one or more main-chain fragments have been correctly identified, side

chains may be built by taking into account the Cα positions, the density distribution in the map, and connectivity criteria; the aim is to dock the polypeptide

fragments into the sequence (assumed to be known). A score is associated with

each possible docking position, so that the chain would have the most probable

side chain conformation.



Fig. 8.C.1 Connection between two candidate Cα atoms. The peptide plane is located between these atoms and rotates about the axis Cα(n)–Cα(n + 1). The two candidates are flagged as Cα atoms if, for a given orientation of the plane, the interpolated density values at the atomic positions are larger than a given threshold. (a) Forward directionality; (b) backward directionality.



9 Charge flipping and VLD (vive la difference)



9.1 Introduction

Direct methods procedures (see Chapter 6) or Patterson techniques (see

Chapter 10), primarily the former, have been methods of choice for crystal

structure solution of small- to medium-sized molecules from diffraction data.

Over the last 30 years, several new phasing algorithms have been proposed,

not requiring the use of triplet and quartet invariants, but based only on the

properties of Fourier transforms. These were not competitive with direct methods and never became popular, but they contain a nucleus for further advances. Among these we mention:

(i) Bhat (1990) proposed a Metropolis technique (Metropolis et al., 1953;

Kirkpatrick et al., 1983; Press et al., 1992), also known as simulated

annealing (the reader is referred to Section 12.9 for details on the

algorithm). From a random set of phases, an electron density map is

calculated, modified, and inverted. The corresponding phases are altered

according to the simulated annealing algorithm, and then used to calculate

a new electron density map. The procedure is cyclic.

(ii) A strictly related simulated annealing procedure has been proposed by Su

(1995). The objective function to minimize was

$R = \sum_{\mathbf{h}} \left( S\,|F_{\mathbf{h}}|_{\mathrm{calc}} - |F_{\mathbf{h}}|_{\mathrm{obs}} \right)^{2},$



where S is the scale factor. The scheme is as follows: random atomic positions are generated and in succession shifted; the simulated annealing

algorithm is applied to accept or reject atomic shifts. At the end, a new

atomic structure is generated, whose positions are shifted in succession,

and so on in a cyclic way.

(iii) The forced coalescence method (FCP) was proposed by Drendel et al.

(1995). Hybrid electron density maps (see Section 7.3.4) were actively

used with different values of τ and ω.

Even if never popular, the above algorithms opened the way to two other

methods which are much more efficient, charge flipping and VLD (vive la difference), to which this chapter is dedicated. Both are based on the properties of






the Fourier transform; they do not require the explicit use of structure invariants

and seminvariants, or a deep knowledge of their properties. The reader should not, however, conclude that the invariance and seminvariance concepts are unnecessary in the handling of these approaches; on the contrary, understanding these basic concepts is essential to the appreciation of these new methods.

To be clearer, when an electron density is modified, a model map is simultaneously identified; and when the model map is Fourier inverted, model structure factors with modulus |Fp| and phase φp are obtained. The reliability of the new phases is usually calculated via the distribution P(R, Rp, φ, φp), described in Section 7.2, which involves estimation of the two-phase structure invariants (φh − φph).



9.2 The charge flipping algorithm

Charge flipping was developed by Oszlányi and Sütő (2004, 2005, 2008) and has been successfully applied to small molecules (Wu et al., 2004; Palatinus and Chapuis, 2007), modulated structures (Palatinus et al., 2006), powder data (Baerlocher et al., 2007a,b), and high resolution protein data (Dumas and van der Lee, 2008). We describe the algorithm step by step (see Fig. 9.1):

1. The list of unique reflections, as fixed by the space group symmetry, is

expanded in P1 to produce a complete list of reflections; Friedel pairs, if

present, are merged.

2. Random starting phases are assigned to the expanded list of reflections.

3. An electron density map is calculated over a grid with spacing adjusted

to RES/2. It may be seen from Fig. 9.2a that, at least at high resolution,

large density values are restricted to a small percentage of pixels, which

therefore carry almost all of the structural information. Figure 9.2a is a

different way of representing the density distributions shown previously in

Figs. 8.B.1 and 8.B.2.

4. The electron density is modified so that all the pixels with density smaller

than a given positive threshold δ (see Fig. 9.2b) are submitted to flipping

(i.e. their density is multiplied by −1). In Fig. 9.1, the modified map is

called ρ mod .

5. The inverse Fourier transform of ρmod is treated as follows: for large amplitudes (about 80% of the total number), calculated phases are associated with observed amplitudes; for weak reflections, the calculated modulus is retained and the phase shifted by π/2. A new electron density map is calculated and the cycle starts again.

Fig. 9.1 Charge flipping algorithm: {|F|, φ} → FT → ρ(r) → MOD → ρmod(r) → FT⁻¹ → {|Fmod|, φmod} → {|F|, φmod} → FT, and so on cyclically. {|F|} is the set of observed reflections, {φ} is the set of random phases. FT and FT⁻¹ indicate the direct and the inverse Fourier transforms, respectively, MOD is the function used to modify the electron density, ρ(r). {|Fmod|} and {φmod} are the structure factor amplitudes and phases obtained by Fourier inversion of ρmod.

Let us first explain the source of the algorithm name. The total charge in a map is assumed to be

$c_{\mathrm{tot}} = \sum_{i} \rho_i,$

where i varies over all grid points and the flipped charge is defined as

$c_{\mathrm{flip}} = \sum_{\rho_i < \delta} |\rho_i|,$

where the summation goes over all points satisfying the condition ρi < δ.

Fig. 9.2 Typical high-resolution electron density distribution, sorted in descending order: (a) the full distribution; (b) the same distribution with the threshold δ marked. The number of pixels is in the abscissa and the density in the ordinate.

The various applications showed that, for proper values of δ, the ratio cflip /ctot

should lie at around 0.9. As a rule of thumb, it should roughly correspond to

inverting the low density pixels shown in Fig. 9.2b. Flipping the density in

this region modifies the electron density distribution and allows us to explore

the phase space efficiently. Giacovazzo and Mazzone (2011) observed that the

flipped region corresponds to that with the largest values of electron density variance. As in other EDM techniques, the algorithm modifies a model

without destroying it; in this way the region to reverse is not part of the solution, otherwise the convergence would never be reached. When a good model

is obtained, reversing the sign of the density for high variance pixels provides

negligible perturbation of the model, which cannot be destroyed.

δ is the most critical parameter; it usually changes during the phasing

process and sometimes has to be tuned to lead the algorithm to succeed.
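One cycle of the basic scheme can be sketched with NumPy FFTs. This is a simplified illustration under stated assumptions: the amplitudes are stored on a full FFT grid in P1, the special treatment of weak reflections (calculated modulus kept, phase shifted by π/2) is omitted, and `np.fft.ifftn` is used arbitrarily as the transform from structure factors to density. The function names are hypothetical.

```python
import numpy as np


def flip_low_density(rho, delta):
    """Multiply by -1 every pixel whose density is below the threshold delta."""
    return np.where(rho < delta, -rho, rho)


def flipped_charge_ratio(rho, delta):
    """c_flip / c_tot as defined in the text."""
    return np.abs(rho[rho < delta]).sum() / rho.sum()


def charge_flipping_cycle(f_obs, phases, delta):
    """One cycle: map -> flip -> transform -> new phases.

    f_obs  : observed amplitudes on the full FFT grid (P1, Friedel-symmetric)
    phases : current phases on the same grid, in radians
    Returns the new phases and the map computed before flipping.
    """
    rho = np.fft.ifftn(f_obs * np.exp(1j * phases)).real
    rho_mod = flip_low_density(rho, delta)
    f_mod = np.fft.fftn(rho_mod)
    # keep the calculated phases; the observed amplitudes are reattached
    # when the next map is computed
    return np.angle(f_mod), rho
```

Starting from random phases, the cycle would be repeated many times while monitoring figures of merit; when the phases are already correct and δ is small compared with the atomic peaks, a cycle leaves them essentially unchanged.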

During each charge flipping cycle, the crystallographic residual and the skewness coefficient of the density map (see Appendix 8.B) are calculated. The phasing process may usually be subdivided into an initial transient step, a long stagnation period in which the phase space is extensively explored, and a stable state. Convergence is assumed to have been reached at the transition to the stable state, which is marked by a sharp increase in the relative skewness accompanied by a drop in the crystallographic residual.
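The skewness monitored here is the third standardized moment of the map, as defined in Appendix 8.B. A minimal sketch follows; the function name is an assumption, and a perfectly flat map gives an undefined value (NaN with a runtime warning) because σd vanishes.

```python
import numpy as np


def map_skewness(rho):
    """Skewness of a density map: mean of ((rho - <rho>) / sigma_d) ** 3."""
    rho = np.asarray(rho, dtype=float)
    dev = rho - rho.mean()
    sigma_d = np.sqrt(np.mean(dev**2))
    return float(np.mean((dev / sigma_d) ** 3))
```

A map dominated by a few sharp positive peaks on a flat background gives a large positive value, which is the signature looked for at convergence.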

Charge flipping solves the structures in P1. This trick was first applied by Sheldrick and Gould (1995), when solution in the correct space group was not successful; it was later adopted by other authors. The main advantage

of solving a structure in P1 is that the restraints imposed by symmetry on the

phase values are relaxed and the phases may sometimes converge smoothly

to the correct values. However, the use of P1 remained infrequent for direct

methods; indeed, symmetry is important prior information which should not

generally be suppressed. Charge flipping, however, renounces this information;

it is not clear why, but its efficiency decreases dramatically when phasing is

attempted using the correct space group symmetry.

In accordance with the above observations, the charge flipping crystal structure solution step is followed by a second step, restating the correct space group

symmetry; i.e. it locates the space group symmetry elements in the P1 density

map. A technique is therefore necessary to automatically find the shift between

the origin of the P1 map and the conventional origin of the space group. This




process has to be accompanied by density averaging over the symmetry equivalent (in the correct space group) grid points. The algorithm used for returning

to the correct space group is similar to the RELAX procedure developed by

Burla et al. (2000), and later improved by Caliandro et al. (2007a). Since

RELAX also plays an important role in the VLD approach, we describe it in

Appendix 9.B.



9.3 The VLD phasing method

In Section 7.2, we assumed that a model structure is available; to deal in an optimal way with the phasing problem, we calculated the joint probability distribution P(R, Rp, φ, φp) (see equation (7.3)). In Section 7.3, we showed its extraordinary usefulness for optimization of some widely used crystallographic tools and phasing procedures; we refer in particular to the observed Fourier synthesis via the use of the weight m, and to the difference Fourier synthesis via the use of the Read coefficients, mE − σA Ep.

Let us now consider ρ, ρp, ρq; these are the target, the model, and their ideal difference structure; ρq = ρ − ρp has the property that, summed to ρp, it provides ρ, no matter what the quality of ρp is. Let R, Rp, Rq, φ, φp, φq be the corresponding normalized diffraction amplitudes and phases. Would the distribution,

$P(R, R_p, R_q, \phi, \phi_p, \phi_q),$ (9.1)



be more useful than (7.3)? The hope is that including into the probabilistic

approach the additional variate Eq could lead to more accurate conditional

distributions, estimating phases given three rather than two magnitudes.

Distribution (9.1) (studied by Burla et al., 2010a) is the theoretical basis

of the VLD (vive la difference) algorithm; for the interested reader, some

details are quoted in Appendix 9.A.1, together with the conditional distributions which support VLD. The VLD algorithm (Burla et al., 2010b, 2011a,b) as

an ab initio phasing technique is described in Sections 9.3.1 to 9.3.2; its applications to ab initio phasing are summarized in Section 9.3.3. We delay until

Section 10.4 some VLD applications in combination with Patterson deconvolution techniques: VLD combination with molecular replacement is described

in Section 13.10.



9.3.1 The algorithm

Distribution (9.1) is practicable only if:

(i) measurement errors are included in the mathematical model;

(ii) the parameter σA (calculated between the model and the target structure)

is not unity.

Indeed, according to the definition of ρq , if condition (i) is violated, then

Fq = F − Fp is determined perfectly by the other two variates, and cannot

be introduced as a third variable in equation (9.1). If condition (ii) is violated

(i.e., σA = 1), then ρp ≡ ρ and ρq ≡ 0; then it is not necessary to calculate a

six-variate distribution (indeed, Fq will be identically equal to zero).


